Abstract

A multiple-antenna amplify-and-forward two-hop interference network with multiple links and multiple
relays is considered. We optimize transmit precoders, receive decoders and relay AF matrices to
maximize the achievable sum rate. Under per user and total relay sum power constraints, we propose 
an efficient algorithm to maximize the total signal to total interference plus noise ratio (TSTINR). 
Computational complexity analysis shows that our proposed algorithm for TSTINR has lower complexity 
than the existing weighted minimum mean square error (WMMSE) algorithm. We analyze and confirm by 
simulations that the TSTINR, WMMSE and the total leakage interference plus noise (TLIN) minimization 
models with per user and total relay sum power constraints can only transmit a single data stream for each 
user. Thus we propose a novel multiple stream TSTINR model with requirement of orthogonal columns 
for precoders, in order to support multiple data streams and thus utilize higher Degrees of Freedom. 
Multiple data streams and larger multiplexing gains are guaranteed. Simulation results show that for 
single stream models, our TSTINR algorithm outperforms the TLIN algorithm generally and outperforms 
WMMSE in medium to high Signal-to-Noise-Ratio scenarios; the system sum rate significantly benefits 
from multiple data streams in medium to high SNR scenarios.
I. Introduction

Relays are often used to aid communications, not only to improve the Quality of Service (QoS) of the user pairs that have weak direct links due to poor channel conditions, but also to increase the multiplexing gain of the network [1]. Among various relay transmit schemes, the most effective ones are Amplify-and-Forward (AF), Compute-and-Forward (CF) and Decode-and-Forward (DF). In particular, the AF protocol is standardized as layer 1 relaying [2] and is thus popular in research because of its simplicity and low complexity.

In this paper we consider the multiple-link multiple-relay network with non-regenerative relaying. There have been many works discussing the optimization of the relay beamforming weights. For the single-antenna case, [3] and [4] study models to solve for the optimal relay AF weights, where the total relay transmit power is minimized under guaranteed Signal-to-Interference-plus-Noise-Ratio (SINR) requirements. There are also extensions to the multiple-antenna case. In [5] the authors explore the network with one multiple-antenna relay and, according to various relay AF matrix schemes, propose the "IRC FlexCoBF" algorithm for the transmit beamforming matrices. For networks with one user pair and parallel relays, [6] discusses the joint optimization of source and relay beamforming with different receiver filters. With a MIMO relay network similar to that of [6], [7] investigates the optimal joint source and relay power allocation to maximize the end-to-end achievable rate. Extending to a network with one transmitter, multiple receivers and multiple relays, [8] proposes a weighted minimum mean square error (WMMSE) model to solve for the source and relay beamforming matrices with an MMSE receive filter. The recent work [9] is based on general MIMO AF relay networks with multiple links and multiple relays. The authors provide algorithms to jointly optimize the users' precoders, decoders and the relay AF matrices. Total leakage interference plus noise (TLIN) minimization and WMMSE models are proposed, both with per user and total relay transmit power constraints. The ideas used to construct the WMMSE models in [8] and [9] are similar. The WMMSE model in [9] is also extended to one with individual user and individual relay power constraints. The precoders, decoders and relay AF matrices are solved alternately, where each subproblem can achieve its optimal solution. However, the algorithm has quite high computational complexity. Here we propose different approaches to approximate the system sum rate and derive a lower complexity algorithm to solve the corresponding optimization problem.

The Degrees of Freedom (DoFs) of a network are closely related to its channel capacity. In high Signal-to-Noise-Ratio (SNR) scenarios, the capacity increases linearly with the number of DoFs. The authors in [10] propose the technique of Interference Alignment (IA) to maximize the achievable DoFs for MIMO networks. This technique optimizes the precoders and decoders in order to eliminate the network interference and approach the capacity of the MIMO network. In MIMO networks, the IA technique has been investigated in depth [11]-[14]. It is shown in [15] that with relays the achievable DoFs of the MIMO interference network are increased, and the capacity as well as the reliability are improved. In two-hop networks, [16] studies the feasibility conditions and the algorithms for relay-aided IA, restricted to the single-antenna case. Aiming to achieve the maximum DoFs of the 2 × 2 × 2 MIMO relay network, [17] and [18] study a similar technique of aligned interference neutralization to explore the optimal transmission scheme for the single-antenna and multiple-antenna cases, respectively. [19] investigates the ergodic capacity of a class of fading 2-user 2-hop networks with the interference neutralization technique. In [20] the maximum achievable DoFs for different kinds of MIMO interference channels and MIMO multiple-hop networks are listed and summarized. For the general K × R × K MIMO relay network, the maximum DoFs have only been analyzed under restrictions on the number of relays. Interestingly, we observe by simulations that the algorithms proposed in [9] all lead to precoders with linearly dependent columns, which results in a single transmitted data stream, corresponding to one DoF, for each user, regardless of the number of antennas at the relay and user nodes. This motivates us to propose multiple data stream models.

In this paper, we propose several models for the general MIMO relay network, according to different purposes and situations. The general transmission process and system model are introduced in Section II. In Section III we set up a Total Signal to Total Interference plus Noise Ratio (TSTINR) maximization model to approximate the system sum rate, with per user and total relay transmit power constraints. Then this TSTINR model, as well as the TLIN and WMMSE models in [9], are extended to those with individual user and individual relay power constraints. Also, the computational complexity of our algorithm is analyzed and compared with that of the WMMSE algorithm in [9]. Our proposed algorithm is shown to have lower complexity. Furthermore, to achieve more than one data stream for each user, we propose a multiple stream TSTINR model in Section IV. Compared to the TSTINR model in Section III, additional orthogonality constraints on the precoders are added. In all the models, the precoding matrices, decoding matrices and relay beamforming matrices are updated alternately. Each subproblem is efficiently solved, with sufficient reduction of the objective function guaranteed in each iteration. We provide simulation results in Section V. Since the network model in [9] is more general than that in [8], we compare our proposed algorithms with those in [9]. The results indicate that TSTINR generally outperforms TLIN, and achieves a higher sum rate than WMMSE in medium to high SNR scenarios, for the single stream cases. The system sum rate benefits considerably from the multiple stream model in medium and high SNR scenarios. Parts of our work are reported in [21] and [22]. Compared to them, we add more details of the proposed algorithms and provide detailed proofs for all the stated theorems. Furthermore, we analyze and compare the detailed computational complexity of our proposed algorithm and the WMMSE algorithm from [9].

Notation: Lowercase and uppercase boldface represent vectors and matrices, respectively. $\mathbb{C}$ represents the complex domain. $\mathrm{Re}(a)$ means the real part of the scalar $a$. $\mathrm{tr}(\mathbf{A})$ and $\|\mathbf{A}\|_F$ are the trace and the Frobenius norm of the matrix $\mathbf{A}$, respectively. $\mathbf{I}_d$ represents the $d \times d$ identity matrix. $\mathcal{K}$ and $\mathcal{R}$ represent the set of user indices $\{1, 2, \ldots, K\}$ and that of relay indices $\{1, 2, \ldots, R\}$, respectively. We use $E(\cdot)$ to denote the statistical expectation. $O(n)$ means the same order as $n$. $\nu_{\min}^{d}(\mathbf{A})$ is composed of the eigenvectors of $\mathbf{A}$ corresponding to its $d$ smallest eigenvalues.

II. System model 

Consider a two-hop interference channel consisting of $K$ user pairs and $R$ relays as in Fig. 1. Transmitter $k$, Receiver $k$ and Relay $r$ are equipped with $M_k$, $N_k$ and $L_r$ antennas, respectively, for any $k \in \mathcal{K}$, $r \in \mathcal{R}$. User $k$ wishes to transmit $d_k$ parallel data streams; $\mathbf{s}_k \in \mathbb{C}^{d_k \times 1}$ denotes the transmit signal vector of User $k$, where $E(\mathbf{s}_k \mathbf{s}_k^H) = \mathbf{I}_{d_k}$. Due to the poor channel conditions between user pairs, there are no direct links among users. Low-complexity relays aid the communication, and the AF transmit protocol is used. Here we assume that perfect channel state information (CSI) is available at a central controller.

The transmission process includes two time slots. In the first time slot, all sources transmit signals to all relays. Relay $r$ receives $\mathbf{x}_r = \sum_{k \in \mathcal{K}} \mathbf{G}_{rk} \mathbf{U}_k \mathbf{s}_k + \mathbf{n}_r$, for all $r \in \mathcal{R}$, where $\mathbf{U}_k \in \mathbb{C}^{M_k \times d_k}$ is the precoding matrix of User $k$, $\mathbf{G}_{rk} \in \mathbb{C}^{L_r \times M_k}$ is the channel coefficient between Transmitter $k$ and Relay $r$, and $\mathbf{n}_r$, with zero mean and covariance matrix $\sigma_1^2 \mathbf{I}_{L_r}$, is the noise at Relay $r$. In the second time slot, by the AF protocol all relays broadcast $\mathbf{t}_r = \mathbf{W}_r \mathbf{x}_r$ to all destinations, for all $r \in \mathcal{R}$, where $\mathbf{W}_r \in \mathbb{C}^{L_r \times L_r}$ is the beamforming matrix of Relay $r$.

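Receiver $k$ observes:

$$\mathbf{y}_k = \sum_{r \in \mathcal{R}} \mathbf{H}_{kr} \mathbf{t}_r + \mathbf{z}_k,$$

for all $k \in \mathcal{K}$, where $\mathbf{H}_{kr} \in \mathbb{C}^{N_k \times L_r}$ is the channel coefficient between Relay $r$ and Receiver $k$, and $\mathbf{z}_k$, with zero mean and covariance matrix $\sigma_2^2 \mathbf{I}_{N_k}$, is the noise at Receiver $k$. Applying the decoding matrix $\mathbf{V}_k \in \mathbb{C}^{N_k \times d_k}$, Receiver $k$ obtains:

$$\tilde{\mathbf{y}}_k = \underbrace{\mathbf{V}_k^H \mathbf{T}_{kk} \mathbf{s}_k}_{\text{desired signal}} + \underbrace{\sum_{q \in \mathcal{K}, q \neq k} \mathbf{V}_k^H \mathbf{T}_{kq} \mathbf{s}_q}_{\text{interference}} + \underbrace{\sum_{r \in \mathcal{R}} \mathbf{V}_k^H \mathbf{H}_{kr} \mathbf{W}_r \mathbf{n}_r + \mathbf{V}_k^H \mathbf{z}_k}_{\text{noise}}. \tag{1}$$

The right hand side of (1) contains three terms: the desired signal, the interference from other users, and the noise, which includes the relay enhanced noise and the local noise. The effective channel from Transmitter $q$ to Receiver $k$ is given by $\mathbf{T}_{kq} = \sum_{r \in \mathcal{R}} \mathbf{H}_{kr} \mathbf{W}_r \mathbf{G}_{rq} \mathbf{U}_q$. Suppose all the transmit signals and noise in the system are independent of each other. The transmit powers at each user and each relay are, respectively: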

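$$P_k^U = E(\|\mathbf{U}_k \mathbf{s}_k\|_F^2) = \mathrm{tr}(\mathbf{U}_k^H \mathbf{U}_k), \quad k \in \mathcal{K},$$

$$P_r^R = E(\|\mathbf{t}_r\|_F^2) = \sum_{k \in \mathcal{K}} \|\mathbf{W}_r \mathbf{G}_{rk} \mathbf{U}_k\|_F^2 + \sigma_1^2 \|\mathbf{W}_r\|_F^2, \quad r \in \mathcal{R}.$$

Then the total relay transmit power is $P^R = \sum_{r \in \mathcal{R}} P_r^R$.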

In the following two sections we propose different models with corresponding algorithms to obtain efficient system precoders, decoders and relay beamforming matrices. For simplicity of expression, we predefine some symbols here: the precoded and decoded effective channels from Transmitter $k$ to Relay $r$ are $\bar{\mathbf{G}}_{rk} = \mathbf{G}_{rk} \mathbf{U}_k$ and $\bar{\mathbf{W}}_{rk} = \mathbf{W}_r \mathbf{G}_{rk}$, respectively; the precoded and decoded effective channels from Relay $r$ to Receiver $k$ are $\bar{\mathbf{H}}_{kr} = \mathbf{H}_{kr} \mathbf{W}_r$ and $\bar{\mathbf{V}}_{kr} = \mathbf{V}_k^H \mathbf{H}_{kr}$, respectively, $k \in \mathcal{K}$, $r \in \mathcal{R}$.

III. Total Signal to Total Interference plus Noise Ratio model 

In this section, we develop a new model to approximate sum rate maximization. A low complexity 
algorithm to optimize the users' precoders, decoders and relay beamforming matrices is proposed. And 
its computational complexity is analyzed. 

A. A new model with per user and total relay power constraints 

First, we set up the new model of maximizing TSTINR with per user and total relay transmit power 
constraints. 

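1) Optimization problem formulation: Define $\mathrm{TSTINR} = \dfrac{P^S}{P^I + P^N} = \dfrac{\sum_{k \in \mathcal{K}} P_k^S}{\sum_{k \in \mathcal{K}} (P_k^I + P_k^N)}$, where

$$P_k^S = E(\|\mathbf{V}_k^H \mathbf{T}_{kk} \mathbf{s}_k\|_F^2) = \Big\|\mathbf{V}_k^H \sum_{r \in \mathcal{R}} \mathbf{H}_{kr} \mathbf{W}_r \mathbf{G}_{rk} \mathbf{U}_k\Big\|_F^2, \tag{2}$$

$$P_k^I = E\Big(\Big\|\sum_{q \in \mathcal{K}, q \neq k} \mathbf{V}_k^H \mathbf{T}_{kq} \mathbf{s}_q\Big\|_F^2\Big) = \sum_{q \in \mathcal{K}, q \neq k} \Big\|\mathbf{V}_k^H \sum_{r \in \mathcal{R}} \mathbf{H}_{kr} \mathbf{W}_r \mathbf{G}_{rq} \mathbf{U}_q\Big\|_F^2, \tag{3}$$

$$P_k^N = E\Big(\Big\|\sum_{r \in \mathcal{R}} \mathbf{V}_k^H \mathbf{H}_{kr} \mathbf{W}_r \mathbf{n}_r + \mathbf{V}_k^H \mathbf{z}_k\Big\|_F^2\Big) = \sigma_1^2 \sum_{r \in \mathcal{R}} \|\mathbf{V}_k^H \mathbf{H}_{kr} \mathbf{W}_r\|_F^2 + \sigma_2^2 \|\mathbf{V}_k\|_F^2 \tag{4}$$

are the desired signal power, the leakage interference and the noise power at Receiver $k$, respectively.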
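
As a minimal numerical sketch of definitions (2)-(4), the following NumPy snippet evaluates the three power terms and the resulting TSTINR for one randomly drawn network. The dimensions, the random channels, the noise variances and the helper names `powers` and `cn` are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def powers(U, V, W, G, H, s1, s2):
    """Return (P_S, P_I, P_N) summed over all receivers, following (2)-(4).

    G[r][q] is the channel from Transmitter q to Relay r, H[k][r] the channel
    from Relay r to Receiver k; s1 and s2 are the noise variances sigma_1^2
    and sigma_2^2.
    """
    K, R = len(U), len(W)
    PS = PI = PN = 0.0
    for k in range(K):
        for q in range(K):
            # Effective channel T_kq = sum_r H_kr W_r G_rq U_q
            T = sum(H[k][r] @ W[r] @ G[r][q] @ U[q] for r in range(R))
            p = np.linalg.norm(V[k].conj().T @ T) ** 2   # Frobenius norm
            if q == k:
                PS += p
            else:
                PI += p
        PN += s1 * sum(np.linalg.norm(V[k].conj().T @ H[k][r] @ W[r]) ** 2
                       for r in range(R))
        PN += s2 * np.linalg.norm(V[k]) ** 2
    return PS, PI, PN

rng = np.random.default_rng(0)
cn = lambda *shape: rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

K, R, M, N, L, d = 2, 2, 3, 3, 2, 2          # hypothetical sizes
s1, s2 = 0.1, 0.1                            # hypothetical noise variances
G = [[cn(L, M) for _ in range(K)] for _ in range(R)]
H = [[cn(N, L) for _ in range(R)] for _ in range(K)]
U = [cn(M, d) for _ in range(K)]
W = [cn(L, L) for _ in range(R)]
V = [np.linalg.qr(cn(N, d))[0] for _ in range(K)]   # orthonormal columns

PS, PI, PN = powers(U, V, W, G, H, s1, s2)
tstinr = PS / (PI + PN)
```

Because only Frobenius norms of $\mathbf{V}_k^H(\cdot)$ enter, right-multiplying each decoder by a unitary matrix leaves all three terms unchanged, which can be confirmed with the same helper.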



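We wish to maximize the system sum rate

$$R_{\mathrm{sum}} = \sum_{k \in \mathcal{K}} \log_2 \det(\mathbf{I}_{N_k} + \mathbf{F}_k^{-1} \mathbf{T}_{kk} \mathbf{T}_{kk}^H) \tag{5}$$

with $\mathbf{F}_k = \sum_{q \in \mathcal{K}, q \neq k} \mathbf{T}_{kq} \mathbf{T}_{kq}^H + \sigma_1^2 \sum_{r \in \mathcal{R}} \bar{\mathbf{H}}_{kr} \bar{\mathbf{H}}_{kr}^H + \sigma_2^2 \mathbf{I}_{N_k}$. The direct optimization of the system sum rate is complicated. Therefore, we approximate it by the TSTINR and maximize the TSTINR instead. As the TSTINR remains invariant with $\mathbf{V}_k$ replaced by $\mathbf{V}_k \mathbf{Q}$, where $\mathbf{Q}$ is any $d_k$-dimensional unitary matrix, we require the decoders $\mathbf{V}_k$, $k \in \mathcal{K}$, to have orthonormal columns, which serve as bases of the $d_k$-dimensional solution subspaces. Also, the following theorem holds:

Theorem 1: For any precoder $\mathbf{U}_k$, $k \in \mathcal{K}$, relay beamforming matrix $\mathbf{W}_r$, $r \in \mathcal{R}$, and any decoder $\mathbf{V}_k$ satisfying $\mathbf{V}_k^H \mathbf{V}_k = \mathbf{I}_{d_k}$, $k \in \mathcal{K}$, we have $\log_2[1 + \mathrm{TSTINR}(\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\})] \le R_{\mathrm{sum}}(\{\mathbf{U}\}, \{\mathbf{W}\})$.

The detailed proof is shown in Appendix A. This states that the result from maximizing the TSTINR provides a guaranteed system throughput. Besides the orthogonality constraints on the decoders, we add fixed transmit power constraints¹ per user and for the total relay power. Then the corresponding optimization problem is: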

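$$
\begin{aligned}
\max_{\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\}} \quad & \mathrm{TSTINR} = \frac{\sum_{k \in \mathcal{K}} P_k^S}{\sum_{k \in \mathcal{K}} (P_k^I + P_k^N)} && \text{(6a)} \\
\text{s.t.} \quad & \mathbf{V}_k^H \mathbf{V}_k = \mathbf{I}_{d_k}, && \text{(6b)} \\
& \|\mathbf{U}_k\|_F^2 = p_0^U, \quad k \in \mathcal{K}, && \text{(6c)} \\
& \sum_{r \in \mathcal{R}} \Big( \sum_{k \in \mathcal{K}} \|\mathbf{W}_r \mathbf{G}_{rk} \mathbf{U}_k\|_F^2 + \sigma_1^2 \|\mathbf{W}_r\|_F^2 \Big) = p_{\max}^R. && \text{(6d)}
\end{aligned}
$$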

¹ Fixed power constraints mean that all the power constraints are equality constraints, which are called constraints without power control in [5].

2) Problem reformulation: There is a lack of efficient methods to deal with (6a), because it is a fraction. This makes problem (6) difficult to solve. Motivated by Dinkelbach's work [23] on nonlinear fractional optimization problems over convex sets, we use a parameter $C$ to combine the denominator and the numerator into a new objective function, although the conclusions in [23] cannot be extended to problem (6), whose feasible set is nonconvex. Reformulate (6) as follows:

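$$
\begin{aligned}
\min_{\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\}} \quad & f(\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\}; C) = C(P^I + P^N) - P^S = \sum_{k \in \mathcal{K}} \big[ C(P_k^I + P_k^N) - P_k^S \big] && \text{(7a)} \\
\text{s.t.} \quad & \mathbf{V}_k^H \mathbf{V}_k = \mathbf{I}_{d_k}, && \text{(7b)} \\
& \mathrm{tr}(\mathbf{U}_k^H \mathbf{U}_k) = p_0^U, \quad k \in \mathcal{K}, && \text{(7c)} \\
& \sum_{r \in \mathcal{R}} \Big( \sum_{k \in \mathcal{K}} \|\mathbf{W}_r \mathbf{G}_{rk} \mathbf{U}_k\|_F^2 + \sigma_1^2 \|\mathbf{W}_r\|_F^2 \Big) = p_{\max}^R. && \text{(7d)}
\end{aligned}
$$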

Thus in each iteration we solve (7), and then update the parameter $C$ as follows: initially $C$ is set to a small positive scalar (for example $C = 1$), then after each iteration it is updated as

$$C = \frac{P^S(\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\})}{P^I(\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\}) + P^N(\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\})}. \tag{8}$$

With this updating strategy for $C$, we have the following theorem, which is proved in Appendix B:

Theorem 2: If the objective function of (7) has sufficient reduction in each iteration and $C$ is updated as in (8), then the objective function of (6), TSTINR, is monotonically increasing. Any stationary point of (7) is also a stationary point of (6).
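
The effect of the update (8) can be illustrated on a toy problem. The sketch below rests on simplifying assumptions: the feasible set is replaced by a small hypothetical list of candidate pairs $(P^S, P^I + P^N)$, and each inner problem is minimized exactly rather than merely reduced sufficiently. Under these assumptions the recorded ratios never decrease.

```python
def dinkelbach(cands, C=1.0, max_iters=50):
    """Toy Dinkelbach-type iteration over a finite candidate list.

    cands holds hypothetical (signal, interference_plus_noise) pairs.
    Each step minimizes C * denominator - numerator, then resets C to the
    ratio attained by the minimizer, mirroring the update rule (8).
    """
    history = []
    for _ in range(max_iters):
        s, d = min(cands, key=lambda sd: C * sd[1] - sd[0])
        C_new = s / d
        history.append(C_new)
        if C_new == C:          # parameter no longer changes
            break
        C = C_new
    return C, history

cands = [(1.0, 2.0), (3.0, 4.0), (2.0, 1.5), (5.0, 6.0), (0.5, 0.2)]
C_final, history = dinkelbach(cands)
```

In the algorithm of this section the inner problem is only decreased by alternating updates, so the toy example shows the monotone behavior of the parameter, not convergence to a global optimum.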

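3) Alternating minimization algorithm: The program (7) itself is a nonconvex nonlinear matrix optimization problem, which is difficult to handle jointly. Thus we solve for the precoders $\mathbf{U}_k$, $k \in \mathcal{K}$, the decoders $\mathbf{V}_k$, $k \in \mathcal{K}$, and the relay beamforming matrices $\mathbf{W}_r$, $r \in \mathcal{R}$, alternately. Efficient algorithms are developed for each subproblem.

First, we fix $\mathbf{U}_k$, $k \in \mathcal{K}$, and $\mathbf{W}_r$, $r \in \mathcal{R}$; then all $\mathbf{V}_k$, $k \in \mathcal{K}$, are independent of each other. The subproblem for $\mathbf{V}_k$ becomes:

$$\min_{\mathbf{X}} \ \mathrm{tr}(\mathbf{X}^H \mathbf{A} \mathbf{X}) \quad \text{s.t.} \quad \mathbf{X}^H \mathbf{X} = \mathbf{I}_{d_k}, \tag{9}$$

where $\mathbf{X}$ represents the variable $\mathbf{V}_k$ and $\mathbf{A} = C \mathbf{F}_k - \mathbf{T}_{kk} \mathbf{T}_{kk}^H$. Since $\mathbf{A}$ is Hermitian, we obtain the closed-form solution of (9) as $\mathbf{X} = \nu_{\min}^{d_k}(\mathbf{A})$.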
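
A minimal NumPy sketch of this closed form, assuming a randomly generated Hermitian matrix stands in for $\mathbf{A}$ (the function name `decoder_update` is hypothetical): `numpy.linalg.eigh` returns the eigenvalues of a Hermitian matrix in ascending order, so the first $d$ eigenvectors form a minimizer.

```python
import numpy as np

def decoder_update(A, d):
    """Return an N x d matrix with orthonormal columns minimizing tr(X^H A X)."""
    eigvals, eigvecs = np.linalg.eigh(A)   # ascending eigenvalues, Hermitian A
    return eigvecs[:, :d]                  # eigenvectors of the d smallest ones

rng = np.random.default_rng(1)
Z = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (Z + Z.conj().T) / 2                   # stand-in for C*F_k - T_kk T_kk^H
X = decoder_update(A, 2)
value = np.trace(X.conj().T @ A @ X).real  # equals the sum of the 2 smallest eigenvalues
```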



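Next, we solve the subproblem for $\mathbf{W}_r$. Given a certain index $r \in \mathcal{R}$, we fix $\mathbf{U}_k$, $\mathbf{V}_k$, $k \in \mathcal{K}$, and $\{\mathbf{W}_{-r}\}$. Thus the optimization subproblem for $\mathbf{W}_r$ is:

$$
\begin{aligned}
\min_{\mathbf{X} \in \mathbb{C}^{L_r \times L_r}} \quad & \sum_{k \in \mathcal{K}} \mathrm{tr}\big[\mathbf{X} (\mathbf{P}_{rr}^k + C \sigma_1^2 \mathbf{I}_{L_r}) \mathbf{X}^H \bar{\mathbf{V}}_{kr}^H \bar{\mathbf{V}}_{kr}\big] + 2 \mathrm{Re}\Big[\sum_{k \in \mathcal{K}} \sum_{l \in \mathcal{R}, l \neq r} \mathrm{tr}(\mathbf{X} \mathbf{P}_{rl}^k \mathbf{W}_l^H \bar{\mathbf{V}}_{kl}^H \bar{\mathbf{V}}_{kr})\Big] \\
\text{s.t.} \quad & \mathrm{tr}\Big[\mathbf{X} \Big(\sum_{k \in \mathcal{K}} \bar{\mathbf{G}}_{rk} \bar{\mathbf{G}}_{rk}^H + \sigma_1^2 \mathbf{I}_{L_r}\Big) \mathbf{X}^H\Big] = \eta_1,
\end{aligned}
\tag{10}
$$

where $\mathbf{P}_{rl}^k = C \sum_{q \in \mathcal{K}, q \neq k} \bar{\mathbf{G}}_{rq} \bar{\mathbf{G}}_{lq}^H - \bar{\mathbf{G}}_{rk} \bar{\mathbf{G}}_{lk}^H$, $k \in \mathcal{K}$, $r, l \in \mathcal{R}$, and $\eta_1 = p_{\max}^R - \sum_{l \in \mathcal{R}, l \neq r} \big( \sum_{k \in \mathcal{K}} \|\mathbf{W}_l \bar{\mathbf{G}}_{lk}\|_F^2 + \sigma_1^2 \|\mathbf{W}_l\|_F^2 \big)$. Problem (10) is equivalent to a specific Quadratically Constrained Quadratic Program (QCQP) with $\mathbf{x} = \mathrm{vec}(\mathbf{X})$:

$$
\begin{aligned}
\min_{\mathbf{x} \in \mathbb{C}^{L_r^2 \times 1}} \quad & f(\mathbf{x}) = \mathbf{x}^H \mathbf{B}_1 \mathbf{x} + \mathbf{b}^H \mathbf{x} + \mathbf{x}^H \mathbf{b} && \text{(11a)} \\
\text{s.t.} \quad & \mathbf{x}^H \mathbf{B}_2 \mathbf{x} = \eta_1. && \text{(11b)}
\end{aligned}
$$

Here $\mathbf{B}_1 = \sum_{k \in \mathcal{K}} (\mathbf{P}_{rr}^k + C \sigma_1^2 \mathbf{I}_{L_r})^T \otimes (\bar{\mathbf{V}}_{kr}^H \bar{\mathbf{V}}_{kr})$, $\mathbf{B}_2 = \big(\sum_{k \in \mathcal{K}} \bar{\mathbf{G}}_{rk} \bar{\mathbf{G}}_{rk}^H + \sigma_1^2 \mathbf{I}_{L_r}\big)^T \otimes \mathbf{I}_{L_r}$ and $\mathbf{b} = \mathrm{vec}\big(\sum_{k \in \mathcal{K}} \sum_{l \in \mathcal{R}, l \neq r} \bar{\mathbf{V}}_{kr}^H \bar{\mathbf{V}}_{kl} \mathbf{W}_l \mathbf{P}_{lr}^k\big)$. From the expressions, we know that $\mathbf{B}_2 \succ 0$ and that generally $\mathbf{B}_1$ is indefinite. Here we also discuss the case that $\mathbf{B}_1$ is positive semi-definite. First we show that the following theorem holds: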

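Theorem 3: Given $\mathbf{B}_2 = \mathbf{Q}^H \mathbf{Q}$, $\mathbf{Q} \succ 0$, $\mathbf{p} = \mathbf{Q} \mathbf{x}$, $\bar{\mathbf{B}}_1 = \mathbf{Q}^{-H} \mathbf{B}_1 \mathbf{Q}^{-1}$ and $\bar{\mathbf{b}} = \mathbf{Q}^{-H} \mathbf{b}$, (11) is equivalent to

$$\min_{\mathbf{p}^H \mathbf{p} = \eta_1} \ \mathbf{p}^H \bar{\mathbf{B}}_1 \mathbf{p} + \bar{\mathbf{b}}^H \mathbf{p} + \mathbf{p}^H \bar{\mathbf{b}}. \tag{12}$$

Further, if $\mathbf{B}_1$ is indefinite, (11) is equivalent to:

$$\min_{\mathbf{p}^H \mathbf{p} \le \eta_1} \ \mathbf{p}^H \bar{\mathbf{B}}_1 \mathbf{p} + \bar{\mathbf{b}}^H \mathbf{p} + \mathbf{p}^H \bar{\mathbf{b}}. \tag{13}$$

For the case of positive semi-definite $\mathbf{B}_1$: if the optimal solution $\mathbf{p}_0$ of (13) is not that of (12), then $\mathbf{p}_0 = -(\bar{\mathbf{B}}_1)^{-1} \bar{\mathbf{b}}$.

Proof: It is trivial to prove that (11) is equivalent to (12), and thus the detailed proof is omitted. The global optimality conditions of (13) are as follows: there exists $\lambda \ge 0$ such that $\bar{\mathbf{B}}_1 + \lambda \mathbf{I}_{L_r^2} \succeq 0$, $\mathbf{p}^*(\lambda) = -(\bar{\mathbf{B}}_1 + \lambda \mathbf{I}_{L_r^2})^{-1} \bar{\mathbf{b}}$, $\lambda(\|\mathbf{p}^*(\lambda)\|_2^2 - \eta_1) = 0$ and $\|\mathbf{p}^*(\lambda)\|_2^2 \le \eta_1$. For the case that $\mathbf{B}_1$ is indefinite, it follows that $\bar{\mathbf{B}}_1$ is also indefinite. Thus it must hold that $\lambda > 0$. Then from the complementarity condition we must have $\|\mathbf{p}^*(\lambda)\|_2^2 = \eta_1$. Such $\lambda$ and $\mathbf{p}^*$ satisfy the global optimality conditions of (12). Therefore solving (11) is equivalent to solving (13) with indefinite $\bar{\mathbf{B}}_1$.

If the optimal solution $\mathbf{p}_0$ of (13) is not that of (12), then $\|\mathbf{p}_0\|_2^2 < \eta_1$. From the complementarity condition we have $\lambda = 0$. Then $\mathbf{p}_0 = -(\bar{\mathbf{B}}_1)^{-1} \bar{\mathbf{b}}$. ■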
Problem (13) is a typical trust region (TR) subproblem in the trust region optimization method. [24, Chapter 6.1.1] provides an efficient algorithm to achieve its optimal solution. It first checks whether $\|\mathbf{p}^*(0)\|_2^2 \le \eta_1$ with $\lambda = 0$. If so, $\mathbf{p}^*(0)$ is the optimal solution of (13); if not, the optimality conditions are used directly, and the optimal Lagrange multiplier $\lambda$ is calculated by Newton's root-finding method from $\|\mathbf{p}^*(\lambda)\|_2 = \sqrt{\eta_1}$.² When $\mathbf{B}_1$ is indefinite, we solve (13) with the corresponding algorithm. When $\mathbf{B}_1$ is positive semi-definite, we modify the algorithm to solve (12): check whether $\|\mathbf{p}^*(0)\|_2^2 = \eta_1$, and if that is not the case, calculate $\lambda$ by Newton's root-finding method. With this TR method we are able to solve (11) efficiently, and construct $\mathbf{W}_r$ from $\mathbf{x}$.
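
The structure of this procedure can be sketched in a few lines of NumPy. The sketch below is not the algorithm of [24]: for brevity it uses an eigendecomposition and plain bisection on the secular equation instead of Newton's method, it treats only the inequality-constrained problem (13), and it assumes the generic (non-hard) case. The function name `tr_subproblem` and the toy data are hypothetical.

```python
import numpy as np

def tr_subproblem(B, b, eta, iters=200):
    """Minimize p^H B p + b^H p + p^H b subject to ||p||^2 <= eta.

    Assumes the generic (non-hard) case: b has a nonzero component along
    the eigenvector of the smallest eigenvalue of the Hermitian matrix B.
    """
    w, Q = np.linalg.eigh(B)               # B = Q diag(w) Q^H, w ascending
    c = Q.conj().T @ b

    def norm2(lam):                        # ||p*(lam)||^2, p* = -(B + lam I)^{-1} b
        return float(np.sum(np.abs(c) ** 2 / (w + lam) ** 2))

    if w[0] > 0 and norm2(0.0) <= eta:     # interior solution, lam = 0
        lam = 0.0
    else:                                  # boundary solution, ||p*(lam)||^2 = eta
        lo = max(0.0, -w[0])
        hi = lo + 1.0
        while norm2(hi) > eta:             # enlarge the bracket until feasible
            hi = lo + 2.0 * (hi - lo)
        for _ in range(iters):             # bisection on the secular equation
            mid = 0.5 * (lo + hi)
            if norm2(mid) > eta:
                lo = mid
            else:
                hi = mid
        lam = hi
    return -Q @ (c / (w + lam)), lam

B = np.diag([-1.0, 0.5, 2.0]).astype(complex)   # indefinite toy matrix
b = np.array([1.0 + 1.0j, 0.5, -0.3j])
p, lam = tr_subproblem(B, b, eta=1.0)
```

For the indefinite toy matrix the multiplier exceeds the magnitude of the most negative eigenvalue and the solution lies on the boundary, in line with the proof of Theorem 3.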

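For a precoder $\mathbf{U}_k$, we fix $k \in \mathcal{K}$, $\mathbf{W}_r$, $r \in \mathcal{R}$, and $\{\mathbf{U}_{-k}\}$ to get the following subproblem:

$$
\begin{aligned}
\min_{\mathbf{X}} \quad & \mathrm{tr}(\mathbf{X}^H \mathbf{Q}_k \mathbf{X}) \\
\text{s.t.} \quad & \|\mathbf{X}\|_F^2 = p_0^U, \\
& \mathrm{tr}(\mathbf{X}^H \mathbf{L}_k \mathbf{X}) = \eta_2.
\end{aligned}
\tag{14}
$$

Here $\mathbf{X}$ represents $\mathbf{U}_k$, $k \in \mathcal{K}$, and

$$\mathbf{Q}_k = \sum_{r \in \mathcal{R}} \sum_{l \in \mathcal{R}} \bar{\mathbf{W}}_{rk}^H \Big( C \sum_{q \in \mathcal{K}, q \neq k} \bar{\mathbf{V}}_{qr}^H \bar{\mathbf{V}}_{ql} - \bar{\mathbf{V}}_{kr}^H \bar{\mathbf{V}}_{kl} \Big) \bar{\mathbf{W}}_{lk}, \qquad \mathbf{L}_k = \sum_{r \in \mathcal{R}} \bar{\mathbf{W}}_{rk}^H \bar{\mathbf{W}}_{rk},$$

$$\eta_2 = p_{\max}^R - \sum_{r \in \mathcal{R}} \sum_{q \in \mathcal{K}, q \neq k} \|\mathbf{W}_r \mathbf{G}_{rq} \mathbf{U}_q\|_F^2 - \sigma_1^2 \sum_{r \in \mathcal{R}} \|\mathbf{W}_r\|_F^2.$$

Let $\mathbf{x} = \mathrm{vec}(\mathbf{X})$, $\mathbf{C}_1 = \mathbf{I}_{d_k} \otimes \mathbf{Q}_k$ and $\mathbf{C}_2 = \mathbf{I}_{d_k} \otimes \mathbf{L}_k$. Then (14) is turned into a nonconvex QCQP:

$$
\begin{aligned}
\min_{\mathbf{x}} \quad & \mathbf{x}^H \mathbf{C}_1 \mathbf{x} \\
\text{s.t.} \quad & \mathbf{x}^H \mathbf{x} = p_0^U, \\
& \mathbf{x}^H \mathbf{C}_2 \mathbf{x} = \eta_2.
\end{aligned}
\tag{15}
$$

² From the computational point of view, $\dfrac{1}{\|\mathbf{p}^*(\lambda)\|_2} = \dfrac{1}{\sqrt{\eta_1}}$ is solved instead [24].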
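
The vectorization steps used for the precoder and relay subproblems rest on two standard Kronecker identities, $\mathrm{tr}(\mathbf{X}^H \mathbf{Q} \mathbf{X}) = \mathbf{x}^H (\mathbf{I} \otimes \mathbf{Q}) \mathbf{x}$ and $\mathrm{tr}(\mathbf{X} \mathbf{A} \mathbf{X}^H \mathbf{B}) = \mathbf{x}^H (\mathbf{A}^T \otimes \mathbf{B}) \mathbf{x}$ with column-stacking $\mathbf{x} = \mathrm{vec}(\mathbf{X})$. The following check is only a numerical sanity test on random matrices; all sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
cn = lambda *shape: rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
vec = lambda X: X.reshape(-1, order="F")        # stack the columns of X

M, d, L = 3, 2, 2
X = cn(M, d)
Z = cn(M, M)
Qk = Z + Z.conj().T                             # hypothetical Hermitian Q_k
x = vec(X)
lhs_precoder = np.trace(X.conj().T @ Qk @ X)
rhs_precoder = x.conj() @ np.kron(np.eye(d), Qk) @ x     # x^H (I_d kron Q_k) x

Xr, A, Bm = cn(L, L), cn(L, L), cn(L, L)        # generic square matrices
xr = vec(Xr)
lhs_relay = np.trace(Xr @ A @ Xr.conj().T @ Bm)
rhs_relay = xr.conj() @ np.kron(A.T, Bm) @ xr   # x^H (A^T kron B) x
```

Note that the relay identity uses the plain transpose `A.T`, not the conjugate transpose, which matches the superscript $T$ in $\mathbf{B}_1$ and $\mathbf{B}_2$.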

As all constraints are equalities, we use the Sequential Quadratic Programming (SQP) algorithm in [25] to solve it. We set the initial point of the SQP algorithm to the precoder calculated in the previous iteration. As SQP converges to a locally optimal solution from the initial point, we are able to guarantee sufficient reduction of the objective function.

With the above analysis, we summarize the framework of the algorithm to solve (6):

input : initial values of $\mathbf{U}_k$, $k \in \mathcal{K}$, and $\mathbf{W}_r$, $r \in \mathcal{R}$; $C = 1$
output: $\mathbf{U}_k$, $\mathbf{V}_k$, $k \in \mathcal{K}$, and $\mathbf{W}_r$, $r \in \mathcal{R}$
repeat
    Update decoder $\mathbf{V}_k$ by solving (9), $k \in \mathcal{K}$;
    Update relay beamforming matrix $\mathbf{W}_r$ by solving (10), $r \in \mathcal{R}$;
    Update precoder $\mathbf{U}_k$ by solving (14), $k \in \mathcal{K}$;
    Update $C$ as $C := P^S / (P^I + P^N)$;
until convergence;

Algorithm 1: Algorithm for single stream TSTINR model with total relay transmit power constraint

As we guarantee sufficient reduction of the objective function in each subproblem, the objective function value in our algorithm will converge. However, as we have separated the variables into more than two parts, there is no theoretical guarantee that the algorithm converges to a stationary point of (6).

Remark 1: Since the objective functions have similar expressions, the algorithm here for TSTINR is also applicable to the TLIN model in [9]. The objective function of (7) is a linear combination of the total leakage interference plus noise $P^I + P^N$ and the desired signal power $P^S$, and the parameter $C$ balances their weights, while the TLIN model in [9] only minimizes $P^I + P^N$. From the sum rate point of view, our TSTINR model is better motivated. This is verified by simulation results, where a significant improvement of the system sum rate by TSTINR compared to TLIN is shown. A similar objective function has been discussed in [14], where the desired signal power and the leakage interference are combined and optimized. In that case the leakage interference is aimed to be aligned perfectly, thus the parameter $C$ in [14] approaches infinity to satisfy the interference alignment constraint. In our paper, $P^I + P^N$ might not be reduced to zero, and consequently $C$ might not grow to infinity. In [14], $C$ is enlarged when the interference does not have sufficient reduction, which is different from the update strategy here.

B. Models with individual power constraints 

In this subsection, we extend our new TSTINR model, as well as the TLIN and WMMSE models in [9], to the ones with individual user and individual relay fixed transmit power constraints.

1) TSTINR model: With individual user and individual relay fixed transmit power constraints, the TSTINR model becomes:

$$
\begin{aligned}
\max_{\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\}} \quad & \mathrm{TSTINR} = \frac{\sum_{k \in \mathcal{K}} P_k^S}{\sum_{k \in \mathcal{K}} (P_k^I + P_k^N)} \\
\text{s.t.} \quad & \mathbf{V}_k^H \mathbf{V}_k = \mathbf{I}_{d_k}, \\
& \|\mathbf{U}_k\|_F^2 = p_0^U, \quad k \in \mathcal{K}, \\
& \sum_{k \in \mathcal{K}} \|\mathbf{W}_r \mathbf{G}_{rk} \mathbf{U}_k\|_F^2 + \sigma_1^2 \|\mathbf{W}_r\|_F^2 = p_0^R, \quad r \in \mathcal{R}.
\end{aligned}
\tag{16}
$$

Assume there is a preprocessing step that carefully selects the active relays in the communication stage. Here we require all the users and relays to transmit signals with fixed power. The difference from (6) is that the relay sum power constraint is replaced by $R$ individual power constraints, one for each relay.

We use the objective function $f(\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\}; C)$ from (7) in each iteration and preserve the same update strategy for the parameter $C$ as in (8). When applying alternating iterations, the subproblems for the decoders $\mathbf{V}_k$, $k \in \mathcal{K}$, are the same as (9). The objective functions in the subproblems for the relay beamforming matrices $\mathbf{W}_r$, $r \in \mathcal{R}$, and the precoders $\mathbf{U}_k$, $k \in \mathcal{K}$, remain the same as in (10) and (14), respectively. The differences are the constraints.

With $\mathbf{X}$ representing $\mathbf{W}_r$, for any $r \in \mathcal{R}$, its constraint is:

$$\mathrm{tr}\Big[\mathbf{X} \Big(\sum_{k \in \mathcal{K}} \bar{\mathbf{G}}_{rk} \bar{\mathbf{G}}_{rk}^H + \sigma_1^2 \mathbf{I}_{L_r}\Big) \mathbf{X}^H\Big] = p_0^R.$$

The transformed problem has the same structure as (11), and we solve it with the same method.
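For a precoder $\mathbf{U}_k$, $k \in \mathcal{K}$, the constraints are:

$$\|\mathbf{X}\|_F^2 = p_0^U, \qquad \mathrm{tr}(\mathbf{X}^H \bar{\mathbf{W}}_{rk}^H \bar{\mathbf{W}}_{rk} \mathbf{X}) = \eta_3^r, \quad r \in \mathcal{R},$$

where $\mathbf{X}$ here represents the variable $\mathbf{U}_k$ and $\eta_3^r = p_0^R - \sum_{q \in \mathcal{K}, q \neq k} \|\mathbf{W}_r \mathbf{G}_{rq} \mathbf{U}_q\|_F^2 - \sigma_1^2 \|\mathbf{W}_r\|_F^2$. Then the subproblem is reformulated with $\mathbf{x} = \mathrm{vec}(\mathbf{X})$:

$$
\begin{aligned}
\min_{\mathbf{x} \in \mathbb{C}^{M_k d_k \times 1}} \quad & \mathbf{x}^H \mathbf{C}_1 \mathbf{x} \\
\text{s.t.} \quad & \mathbf{x}^H \mathbf{x} = p_0^U, \\
& \mathbf{x}^H \mathbf{C}_3^r \mathbf{x} = \eta_3^r, \quad r \in \mathcal{R},
\end{aligned}
\tag{17}
$$

where $\mathbf{C}_3^r = \mathbf{I}_{d_k} \otimes (\bar{\mathbf{W}}_{rk}^H \bar{\mathbf{W}}_{rk})$. Similar to solving (15), we achieve a locally optimal solution of (17) and sufficient reduction of the objective function by the SQP algorithm.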

Remark 2: As the constraints in (17) are nonlinear equations, to ensure feasibility we normally require that the number of variables is no less than the number of equations. Because we convert the variables from the complex domain to the real domain to solve for them, the number of variables increases to $2 M_k d_k$. So we require $2 M_k d_k \ge R + 1$, for any $k \in \mathcal{K}$, for such a problem. This limits our algorithm. The extension of our low complexity algorithm to constraints with power control is ongoing and future work.

Remark 3: The algorithm in [9] cannot be extended to solve problems with individual relay fixed transmit power constraints, because it uses Semi-Definite Programming (SDP) relaxation to solve the subproblems for the precoders. If R > 3, the relaxation technique may yield a suboptimal solution. Thus the objective function is not guaranteed to have sufficient reduction.

With the adjustment to the subproblems, the basic framework of the algorithm is the same as that in the last subsection. As the objective function of TLIN in [9] is similar to that of the reformulated TSTINR problem, we extend the individual relay power constraints case to TLIN with the corresponding replacement of the objective functions of each subproblem.

2) WMMSE model: Now we extend the individual fixed power constraint model to the WMMSE model in [9]. The corresponding optimization problem is as follows:

$$
\begin{aligned}
\min_{\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\}, \{\mathbf{S}\}} \quad & \sum_{k \in \mathcal{K}} \big\{ \mathrm{tr}\big[\mathbf{S}_k (\mathbf{V}_k^H \tilde{\mathbf{F}}_k \mathbf{V}_k - \mathbf{V}_k^H \mathbf{T}_{kk} - \mathbf{T}_{kk}^H \mathbf{V}_k + \mathbf{I}_{d_k})\big] - \log_2 \det(\mathbf{S}_k) \big\} \\
\text{s.t.} \quad & \mathbf{S}_k \succeq 0, \quad \|\mathbf{U}_k\|_F^2 = p_0^U, \quad k \in \mathcal{K}, \\
& \sum_{k \in \mathcal{K}} \|\mathbf{W}_r \mathbf{G}_{rk} \mathbf{U}_k\|_F^2 + \sigma_1^2 \|\mathbf{W}_r\|_F^2 = p_0^R, \quad r \in \mathcal{R},
\end{aligned}
\tag{18}
$$

with $\tilde{\mathbf{F}}_k = \mathbf{T}_{kk} \mathbf{T}_{kk}^H + \mathbf{F}_k$, and $\mathbf{S}_k \in \mathbb{C}^{d_k \times d_k}$, $k \in \mathcal{K}$, are the weight matrices. In this approach, we try to minimize the weighted mean square error. [26] shows that, if we use the linear MMSE receive filter, (18) shares the same stationary points with the sum rate maximization problem.

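We also apply the alternating minimization algorithm to solve (18). From the above analysis, $\mathbf{V}_k$, for all $k \in \mathcal{K}$, are set as MMSE filters. With fixed $\mathbf{U}_k$, for all $k \in \mathcal{K}$, and $\mathbf{W}_r$, for all $r \in \mathcal{R}$, we have

$$\mathbf{V}_k = \tilde{\mathbf{F}}_k^{-1} \mathbf{T}_{kk}. \tag{19}$$

Take the partial derivative of the objective function of (18) with respect to $\mathbf{S}_k$ and set it to zero. Then we obtain (20):

$$\mathbf{S}_k = \big[\mathbf{V}_k^H \tilde{\mathbf{F}}_k \mathbf{V}_k - \mathbf{V}_k^H \mathbf{T}_{kk} - \mathbf{T}_{kk}^H \mathbf{V}_k + \mathbf{I}_{d_k}\big]^{-1} = \mathbf{I}_{d_k} + \mathbf{T}_{kk}^H \mathbf{F}_k^{-1} \mathbf{T}_{kk}. \tag{20}$$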
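
The second equality in (20) follows from the matrix inversion lemma once the MMSE filter (19) is substituted. As a quick numerical sanity check, assuming random stand-ins for $\mathbf{T}_{kk}$ and a positive definite $\mathbf{F}_k$ (all sizes and names below are hypothetical), both expressions can be compared directly:

```python
import numpy as np

rng = np.random.default_rng(3)
cn = lambda *shape: rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

N, d = 4, 2
T = cn(N, d)                                   # stand-in for T_kk
Z = cn(N, N)
F = Z @ Z.conj().T + np.eye(N)                 # Hermitian positive definite F_k
F_tilde = T @ T.conj().T + F                   # F~_k = T_kk T_kk^H + F_k

V = np.linalg.solve(F_tilde, T)                # MMSE filter, as in (19)
E = V.conj().T @ F_tilde @ V - V.conj().T @ T - T.conj().T @ V + np.eye(d)
S_from_mse = np.linalg.inv(E)                  # inverse of the MSE matrix
S_closed_form = np.eye(d) + T.conj().T @ np.linalg.solve(F, T)
```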

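Fixing all other variables, the subproblem for the relay beamforming matrix $\mathbf{W}_r$, for any $r \in \mathcal{R}$, represented by $\mathbf{X}$, is expressed as follows:

$$
\begin{aligned}
\min_{\mathbf{X} \in \mathbb{C}^{L_r \times L_r}} \quad & \sum_{k \in \mathcal{K}} \mathrm{tr}\Big[\mathbf{X} \Big(\sum_{q \in \mathcal{K}} \bar{\mathbf{G}}_{rq} \bar{\mathbf{G}}_{rq}^H + \sigma_1^2 \mathbf{I}_{L_r}\Big) \mathbf{X}^H \bar{\mathbf{V}}_{kr}^H \mathbf{S}_k \bar{\mathbf{V}}_{kr}\Big] \\
& + 2 \mathrm{Re}\Big[\sum_{k \in \mathcal{K}} \sum_{l \in \mathcal{R}, l \neq r} \sum_{q \in \mathcal{K}} \mathrm{tr}(\mathbf{X} \bar{\mathbf{G}}_{rq} \bar{\mathbf{G}}_{lq}^H \mathbf{W}_l^H \bar{\mathbf{V}}_{kl}^H \mathbf{S}_k \bar{\mathbf{V}}_{kr})\Big] - 2 \mathrm{Re}\Big[\sum_{k \in \mathcal{K}} \mathrm{tr}(\mathbf{X} \bar{\mathbf{G}}_{rk} \mathbf{S}_k \bar{\mathbf{V}}_{kr})\Big] \\
\text{s.t.} \quad & \mathrm{tr}\Big[\mathbf{X} \Big(\sum_{k \in \mathcal{K}} \bar{\mathbf{G}}_{rk} \bar{\mathbf{G}}_{rk}^H + \sigma_1^2 \mathbf{I}_{L_r}\Big) \mathbf{X}^H\Big] = p_0^R.
\end{aligned}
\tag{21}
$$

By applying $\mathbf{x} = \mathrm{vec}(\mathbf{X})$, (21) is transformed into a QCQP similar to (11). We use the same method to solve it.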

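For a precoder $\mathbf{U}_k$, for any $k \in \mathcal{K}$, while fixing all other variables, the subproblem becomes:

$$
\begin{aligned}
\min_{\mathbf{X} \in \mathbb{C}^{M_k \times d_k}} \quad & \mathrm{tr}\Big\{\mathbf{X}^H \Big[\sum_{q \in \mathcal{K}} \Big(\sum_{r \in \mathcal{R}} \bar{\mathbf{V}}_{qr} \bar{\mathbf{W}}_{rk}\Big)^H \mathbf{S}_q \Big(\sum_{l \in \mathcal{R}} \bar{\mathbf{V}}_{ql} \bar{\mathbf{W}}_{lk}\Big)\Big] \mathbf{X}\Big\} - 2 \mathrm{Re}\Big[\mathrm{tr}\Big(\sum_{r \in \mathcal{R}} \mathbf{S}_k \bar{\mathbf{V}}_{kr} \bar{\mathbf{W}}_{rk} \mathbf{X}\Big)\Big] \\
\text{s.t.} \quad & \|\mathbf{X}\|_F^2 = p_0^U, \\
& \mathrm{tr}(\mathbf{X}^H \bar{\mathbf{W}}_{rk}^H \bar{\mathbf{W}}_{rk} \mathbf{X}) = \eta_3^r, \quad r \in \mathcal{R},
\end{aligned}
\tag{22}
$$

where $\mathbf{X}$ represents $\mathbf{U}_k$. With $\mathbf{x} = \mathrm{vec}(\mathbf{X})$, we transform it into a QCQP similar to (17) and solve it efficiently by the SQP method.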

The algorithm to solve the WMMSE model with individual user and individual relay fixed power constraints is as follows:

input : initial values of $\mathbf{U}_k$, $k \in \mathcal{K}$, and $\mathbf{W}_r$, $r \in \mathcal{R}$
output: $\mathbf{U}_k$, $\mathbf{V}_k$, $\mathbf{S}_k$, $k \in \mathcal{K}$, and $\mathbf{W}_r$, $r \in \mathcal{R}$
repeat
    Update decoder $\mathbf{V}_k$ and weight matrix $\mathbf{S}_k$ by (19) and (20), $k \in \mathcal{K}$;
    Update relay AF matrix $\mathbf{W}_r$ by solving (21), $r \in \mathcal{R}$;
    Update precoder $\mathbf{U}_k$ by solving (22), $k \in \mathcal{K}$;
until convergence;

Algorithm 2: Algorithm for WMMSE model with individual relay fixed power constraints



C. Computational complexity analysis 

In this subsection, we compare the computational complexity of the algorithm for our new TSTINR model with that of the algorithm in [9] for the WMMSE model³, both with per user and total relay transmit power constraints as the representative case. As both algorithms consist of three main parts of subproblems, we analyze them individually.

First, we consider the complexity of solving for the decoder $\mathbf{V}_k$, as well as the weight matrix $\mathbf{S}_k$ in WMMSE, $k \in \mathcal{K}$. The construction of the matrix $\mathbf{A}$ in (9) for TSTINR requires the same computations as that of $\tilde{\mathbf{F}}_k$ and $\mathbf{F}_k$ to solve for $\mathbf{V}_k$ and $\mathbf{S}_k$ in WMMSE, as in (19) and (20). Besides, TSTINR requires $9M^3$ operations for the eigenvalue decomposition of $\mathbf{A}$; WMMSE requires $9M^3 + 9M^3 = 18M^3$ operations for the eigenvalue decompositions of $\tilde{\mathbf{F}}_k$ and $\mathbf{F}_k$. Thus in the first part, TSTINR has lower complexity than WMMSE.

³ The algorithm for the WMMSE model and that for the TLIN model in [9] are similar. Thus we only analyze WMMSE as a representative.

Second, we focus on the part for the relay beamforming matrix $\mathbf{W}_r$, for any $r \in \mathcal{R}$. Both algorithms require similar complexity to construct the corresponding subproblem. To solve subproblem (11), our TSTINR algorithm applies the TR method, which is mainly Newton's root-finding method for $\lambda$. It requires only a few inner iterations to find the optimal $\lambda$, and in each inner iteration the main calculation is the QR factorization, with complexity $O((L^2)^3) = O(L^6)$. In contrast, the WMMSE algorithm in [9] applies the SDP method, whose complexity is $O((L^2)^6) = O(L^{12})$. The complexity of the second part of TSTINR is much lower than that of WMMSE.

Third is the part for the precoder $\mathbf{U}_k$, for any $k \in \mathcal{K}$. TSTINR solves (15) by the SQP method, whose complexity is $O((Md)^3) = O(M^3 d^3)$. WMMSE solves a QCQP with the same structure but with inequality constraints, and it applies the SDP relaxation method, with complexity $O((Md)^6) = O(M^6 d^6)$. With similar computations to construct the corresponding subproblem, TSTINR has much lower complexity than WMMSE in the third part.



Complexity comparison for each subproblem in one iteration      TSTINR            WMMSE
1. $\mathbf{V}_k$ and $\mathbf{S}_k$, for any $k \in \mathcal{K}$      $9M^3$            $18M^3$
2. $\mathbf{W}_r$, for any $r \in \mathcal{R}$                         $O(L^6)$          $O(L^{12})$
3. $\mathbf{U}_k$, for any $k \in \mathcal{K}$                         $O(M^3 d^3)$      $O(M^6 d^6)$

TABLE I

Computational complexity analysis for each subproblem in one iteration
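
To give a feel for the gap between the orders listed in Table I, the sketch below evaluates the leading terms for one hypothetical configuration ($M = L = 4$, $d = 2$). Constants hidden by the $O(\cdot)$ notation are ignored, so the numbers only indicate scaling and are not actual operation counts; the function name `leading_terms` is an assumption for illustration.

```python
def leading_terms(M, L, d):
    """Leading-order terms per subproblem as (TSTINR, WMMSE) pairs."""
    return {
        "decoder": (9 * M**3, 18 * M**3),
        "relay": (L**6, L**12),
        "precoder": ((M * d) ** 3, (M * d) ** 6),
    }

terms = leading_terms(M=4, L=4, d=2)
# Ratio WMMSE / TSTINR for each of the three subproblems
ratios = {name: wmmse / tstinr for name, (tstinr, wmmse) in terms.items()}
```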



The complexity differences of each subproblem in one iteration between the TSTINR and the WMMSE models are listed in Table I. From the comparison of the three main parts of the two algorithms, we conclude that our proposed new algorithm for TSTINR enjoys lower complexity than the algorithm for WMMSE in [9]. Similarly, our new algorithm for TSTINR⁴ can be shown to have lower complexity than the algorithm for TLIN in [9]. Numerical evidence shows that the compared algorithms need a similar number of iterations to solve the problem.

⁴ Our proposed algorithm is also applicable to the TLIN model, with complexity similar to that for TSTINR.

Remark 4: Because the SQP method, which solves (15), (17) and (22), can only deal with equality constraints, we restrict our models to fixed power constraints. Although the models with power control constraints are more general and practical, the application of the SQP method saves computational complexity. Even with fixed power constraints, our TSTINR model performs better than the WMMSE model with power control constraints in medium to high SNR scenarios, as shown in Section V.

IV. Multiple stream model 

In this section, we study the multiple data stream model, with the purpose to maximize the system 
sum rate. It is pointed out in [27] that multiple data streams, corresponding to multiple DoFs, help to
increase the capacity of single-hop network in medium to high SNR. This conclusion can be extended 
to the two-hop case, by treating the network as an equivalent single-hop network between users and 
assuming the same power at the relays and the transmitters. First, we analyze the achievable number of 
data streams of the models from Section III. Then our proposed TSTINR model and the corresponding 
algorithm are modified to support multiple data streams for each user pair. Here all the models include 
per user and total relay transmit power constraints. 

A. Analysis of single stream models 

The dimension $d_k$ of the transmit signal $\mathbf{s}_k$ is expected to be the achieved number of data streams at User $k$. However, in simulations when we apply our TSTINR algorithm to the system with $d_k > 1$, we always observe that the system precoder $\mathbf{U}_k$ has rank one. This implies that, with linearly dependent columns of each precoder, we can only achieve one data stream for each user pair, regardless of $d_k$. Similar phenomena are observed for the TLIN and WMMSE algorithms in [9].

The following theorem provides theoretical evidence for the phenomena of the TSTINR and TLIN 
models: 

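Theorem 4: In our proposed TSTINR model and in the TLIN model from [9], the subproblem for 
precoder $\mathbf{U}_k$ always has a rank one optimal solution, regardless of $d_k$. 

Proof: Define $\mathbf{Y} = \mathbf{X}\mathbf{X}^H$ and drop the rank constraint $\operatorname{rank}(\mathbf{Y}) \le d_k$; the subproblem (14) in 
TSTINR is then relaxed to the following semi-definite program: 

$$\min \ \operatorname{tr}(\mathbf{Y}\mathbf{Q}_k) \quad \text{s.t.} \quad \operatorname{tr}(\mathbf{Y}) = p_0^T, \ \operatorname{tr}(\mathbf{L}_k\mathbf{Y}) = \eta_2. \tag{23}$$

In [30, Theorem 4.1], it is shown that (23) always has a rank one optimal solution. Suppose 
$\mathbf{Y}^* = \mathbf{y}^*(\mathbf{y}^*)^H$ is the optimal solution of (23). Let $\mathbf{X}^* = [a_1\mathbf{y}^*, a_2\mathbf{y}^*, \ldots, a_{d_k}\mathbf{y}^*]$ with $a_i \in \mathbb{R}$ and 
$\sum_{i=1}^{d_k} a_i^2 = 1$. That is, the columns of $\mathbf{X}^*$ consist of $a_i\mathbf{y}^*$, $i = 1, \ldots, d_k$. From the fact that 
$\operatorname{tr}[(\mathbf{X}^*)^H\mathbf{Q}_k\mathbf{X}^*] = \sum_{i=1}^{d_k} a_i^2 (\mathbf{y}^*)^H\mathbf{Q}_k\mathbf{y}^* = \operatorname{tr}(\mathbf{Y}^*\mathbf{Q}_k)$, for any feasible $\mathbf{X}$ of (14), we have: 

$$\operatorname{tr}(\mathbf{X}^H\mathbf{Q}_k\mathbf{X}) \ge \operatorname{tr}(\mathbf{Y}^*\mathbf{Q}_k) = \operatorname{tr}[(\mathbf{X}^*)^H\mathbf{Q}_k\mathbf{X}^*].$$

Thus we conclude that $\mathbf{X}^*$ is an optimal solution of (14). Because the subproblem for the precoder in TLIN 
from [9] has the same structure as (14), we conclude the same result for TLIN. ■ 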
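
The proof uses the feasibility of $\mathbf{X}^*$ implicitly. As a hedged worked step, assuming that the constraints of (14) are exactly the trace constraints that survive in the relaxation (23), the three identities below spell out why the rank one candidate is feasible and attains the relaxed optimum. The only ingredient is $\sum_i a_i^2 = 1$.

```latex
\begin{aligned}
\operatorname{tr}\big[(\mathbf{X}^*)^H \mathbf{X}^*\big]
  &= \sum_{i=1}^{d_k} a_i^2\,(\mathbf{y}^*)^H \mathbf{y}^*
   = \operatorname{tr}(\mathbf{Y}^*) = p_0^T, \\
\operatorname{tr}\big[(\mathbf{X}^*)^H \mathbf{L}_k \mathbf{X}^*\big]
  &= \sum_{i=1}^{d_k} a_i^2\,(\mathbf{y}^*)^H \mathbf{L}_k \mathbf{y}^*
   = \operatorname{tr}(\mathbf{L}_k \mathbf{Y}^*) = \eta_2, \\
\operatorname{tr}\big[(\mathbf{X}^*)^H \mathbf{Q}_k \mathbf{X}^*\big]
  &= \sum_{i=1}^{d_k} a_i^2\,(\mathbf{y}^*)^H \mathbf{Q}_k \mathbf{y}^*
   = \operatorname{tr}(\mathbf{Y}^* \mathbf{Q}_k).
\end{aligned}
```

Since every column of $\mathbf{X}^*$ is a multiple of $\mathbf{y}^*$, the matrix has rank one for any choice of the weights $a_i$.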

Remark 5: Theorem 4 shows that, based on the structure of the subproblem, the optimization always 
has rank one precoders in TSTINR and TLIN as solutions. Simulations verify this behavior in all cases. 
The same phenomenon is observed for the WMMSE model of [9], although the conclusion of Theorem 4 
cannot be extended to WMMSE, due to the extra linear term in the objective function of the precoder 
subproblem. Therefore, with the existing models we can only achieve a single data stream for each user 
pair. Thus a new model should be proposed to achieve multiple data streams. 

B. Multiple stream TSTINR model 

In this subsection, we propose the new model based on the TSTINR model in Section III to support 
multiple data streams. Sufficient motivation for the construction of the new model is also provided. 

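1) Analysis of user transmit power allocation: To achieve the required number of parallel data streams, 
we should have independent columns of the precoder $\mathbf{U}_k$ for all $k \in \mathcal{K}$. Without loss of generality we 
require the columns of $\mathbf{U}_k$ to be orthogonal. Since there is a transmit power constraint (6c) for each user 
in (6), we have the power allocation among parallel data streams for User $k$. First, we modify our 
TSTINR model as follows: 

$$
\begin{aligned}
\max_{\{\mathbf{U}\},\{\mathbf{V}\},\{\mathbf{W}\},\{\boldsymbol{\Phi}\}} \quad & \text{TSTINR} = \frac{P^S}{P^I + P^N} \\
\text{s.t.} \quad & \mathbf{U}_k^H\mathbf{U}_k = \boldsymbol{\Phi}_k, \ \mathbf{V}_k^H\mathbf{V}_k = \mathbf{I}_{d_k}, \ k \in \mathcal{K}, \\
& \sum_{r \in \mathcal{R}} \Big( \sum_{k \in \mathcal{K}} \|\mathbf{W}_r\mathbf{G}_{rk}\mathbf{U}_k\|_F^2 + \sigma_1^2\|\mathbf{W}_r\|_F^2 \Big) \le p_{\max}^R, \\
& \operatorname{tr}(\boldsymbol{\Phi}_k) \le p_0^T, \ \boldsymbol{\Phi}_k \text{ is diagonal}, \ \boldsymbol{\Phi}_k \succeq 0, \ k \in \mathcal{K}.
\end{aligned} \tag{24}
$$

Here $\boldsymbol{\Phi}_k$ is a $d_k \times d_k$ diagonal positive semi-definite matrix, which contains the data stream power 
allocation variables of User $k$. 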

From the optimization point of view, the feasible set of the precoder is restricted to have orthogonal 
columns, compared with that of (6). This avoids the phenomenon observed in the solution of (6) that all 
columns of the rank one precoders $\mathbf{U}_k$ are nonzero but linearly dependent. Different from what has been 
mentioned in Theorem 4, here the rank one case of $\mathbf{U}_k$ only happens when exactly one diagonal element of 
$\boldsymbol{\Phi}_k$ is nonzero, which results in all columns of $\mathbf{U}_k$ but one being zero. Hence, we focus on the analysis 
of the subproblem to solve $\mathbf{U}_k$ as well as $\boldsymbol{\Phi}_k$. The reformulation of the objective function of (24) and 
the update strategy of the parameter $C$ are similar to the algorithm to solve (6). Given $k \in \mathcal{K}$, fixing all 
variables other than $\mathbf{U}_k$ and $\boldsymbol{\Phi}_k$, the precoder subproblem becomes: 

$$\min \ \operatorname{tr}(\mathbf{X}^H\mathbf{Q}_k\mathbf{X}) \tag{25a}$$
$$\text{s.t.} \ \mathbf{X}^H\mathbf{X} = \boldsymbol{\Phi}_k, \tag{25b}$$
$$\operatorname{tr}(\mathbf{X}^H\mathbf{L}_k\mathbf{X}) \le \eta_2, \tag{25c}$$
$$\operatorname{tr}(\boldsymbol{\Phi}_k) \le p_0^T, \ \boldsymbol{\Phi}_k \text{ is diagonal}, \ \boldsymbol{\Phi}_k \succeq 0, \tag{25d}$$

where $\mathbf{X}$ represents $\mathbf{U}_k$. 

Theorem 5: The optimal $\boldsymbol{\Phi}_k$ of (25) is of rank one, i.e., there is only one positive element on the 
diagonal of $\boldsymbol{\Phi}_k$. 

The detailed proof is shown in Appendix-C. As described in Theorem 5, at the optimal solution of (25) 
the complete transmit power should be assigned to one data stream. This leads to $\operatorname{rank}(\mathbf{U}_k) = 1$, and 
thus only one data stream can be transmitted for each user. 

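2) New model and the algorithm framework: In our new model for the multiple stream case, without 
transmit power optimization, we assume each user has fixed transmit power $p_0^T$ and require equal power 
allocation among the parallel data streams of each user. This choice accords with the optimal power 
allocation scheme to maximize the system sum rate in the high SNR scenario [28]. Suppose User $k$ has 
$d_k$ parallel data streams; the corresponding optimization problem of the new model becomes: 

$$
\begin{aligned}
\max_{\{\mathbf{U}\},\{\mathbf{V}\},\{\mathbf{W}\}} \quad & \text{TSTINR} = \frac{P^S}{P^I + P^N} \\
\text{s.t.} \quad & \mathbf{U}_k^H\mathbf{U}_k = \frac{p_0^T}{d_k}\mathbf{I}_{d_k}, \ \mathbf{V}_k^H\mathbf{V}_k = \mathbf{I}_{d_k}, \ k \in \mathcal{K}, \\
& \sum_{r \in \mathcal{R}} \Big( \sum_{k \in \mathcal{K}} \|\mathbf{W}_r\mathbf{G}_{rk}\mathbf{U}_k\|_F^2 + \sigma_1^2\|\mathbf{W}_r\|_F^2 \Big) \le p_{\max}^R.
\end{aligned} \tag{26}
$$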

As before, we reformulate the objective function of (26) with the parameter $C$, updated by the same 
strategy as in Section III, which gives $f(\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\}; C) = C(P^I + P^N) - P^S$. We apply the alternating 
minimization method to solve for the precoders, decoders and relay beamforming matrices in the 
reformulated problem. The subproblems for the decoders $\mathbf{V}_k$, $k \in \mathcal{K}$, are the same as in Section III-A. 

For any $r \in \mathcal{R}$, the subproblem for $\mathbf{W}_r$ with all other variables fixed is reformulated as (10) with an 
inequality constraint. It is then equivalent to the typical trust region subproblem (13), and is solved by 
the TR method in [24]. 

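For the precoder $\mathbf{U}_k$, the corresponding subproblem becomes the following, while fixing $\mathbf{V}_q$, $q \in \mathcal{K}$, 
$\mathbf{W}_r$, $r \in \mathcal{R}$ and $\{\mathbf{U}_{-k}\}$: 

$$\min \ \operatorname{tr}(\mathbf{X}^H\mathbf{Q}_k\mathbf{X}) \tag{27a}$$
$$\text{s.t.} \ \mathbf{X}^H\mathbf{X} = \frac{p_0^T}{d_k}\mathbf{I}_{d_k}, \tag{27b}$$
$$\operatorname{tr}(\mathbf{X}^H\mathbf{L}_k\mathbf{X}) \le \eta_2, \tag{27c}$$

where $\mathbf{X}$ represents the variable $\mathbf{U}_k$, and $\mathbf{Q}_k$, $\mathbf{L}_k$ and $\eta_2$ are defined just after (14). 
Before presenting the algorithm to solve (27), we first show its global optimality conditions: 

Theorem 6: The global optimality conditions for (27) are stated as follows. There exists $\mu^* \ge 0$ as 
the Lagrange multiplier of (27c), such that: 

OC1 $\mathbf{X}^*(\mu^*) = \sqrt{p_0^T/d_k}\,\nu_{\min}^{d_k}(\mathbf{Q}_k + \mu^*\mathbf{L}_k)$ is the optimal solution for: 

$$\min_{\mathbf{X}^H\mathbf{X} = \frac{p_0^T}{d_k}\mathbf{I}_{d_k}} \operatorname{tr}[\mathbf{X}^H(\mathbf{Q}_k + \mu^*\mathbf{L}_k)\mathbf{X}]. \tag{28}$$

Here $\nu_{\min}^{d_k}(\cdot)$ denotes the matrix whose columns are the orthonormal eigenvectors corresponding to 
the $d_k$ smallest eigenvalues of its argument. 

OC2 Complementary condition holds: $\mu^*\{\operatorname{tr}[(\mathbf{X}^*)^H\mathbf{L}_k\mathbf{X}^*] - \eta_2\} = 0$. 

OC3 $c(\mu)$ as the function of $\mu$ satisfies (27c): 

$$c(\mu^*) = \operatorname{tr}\{[\mathbf{X}^*(\mu^*)]^H\mathbf{L}_k\mathbf{X}^*(\mu^*)\} \le \eta_2.$$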
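Proof: Suppose there exists a feasible point $\mathbf{X}_0$ of (27) which satisfies $\operatorname{tr}(\mathbf{X}_0^H\mathbf{Q}_k\mathbf{X}_0) < \operatorname{tr}[(\mathbf{X}^*)^H\mathbf{Q}_k\mathbf{X}^*]$. 
If $\mu^* = 0$, then from OC1, $\mathbf{X}^*$ is the global optimal solution of the relaxed problem of (27) obtained by 
dropping the constraint (27c). Thus for any feasible point $\mathbf{X}$ of (27), it holds that 
$\operatorname{tr}[(\mathbf{X}^*)^H\mathbf{Q}_k\mathbf{X}^*] \le \operatorname{tr}(\mathbf{X}^H\mathbf{Q}_k\mathbf{X})$, which contradicts the assumption. 

If $\mu^* > 0$, then $\operatorname{tr}[(\mathbf{X}^*)^H\mathbf{L}_k\mathbf{X}^*] = \eta_2$ holds from OC2. Thus we deduce 
$\mu^*\operatorname{tr}(\mathbf{X}_0^H\mathbf{L}_k\mathbf{X}_0) \le \mu^*\eta_2 = \mu^*\operatorname{tr}[(\mathbf{X}^*)^H\mathbf{L}_k\mathbf{X}^*]$. Then we have 
$\operatorname{tr}[\mathbf{X}_0^H(\mathbf{Q}_k + \mu^*\mathbf{L}_k)\mathbf{X}_0] < \operatorname{tr}[(\mathbf{X}^*)^H(\mathbf{Q}_k + \mu^*\mathbf{L}_k)\mathbf{X}^*]$, which contradicts 
the fact that $\mathbf{X}^*$ is the global optimal solution of (28). For both cases we have proved that the assumption 
for $\mathbf{X}_0$ does not hold. Thus $\mathbf{X}^*$ is the global optimal solution of (27). ■ 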
With eigenvalue decomposition, we obtain the optimal solution of (28) as 
$\mathbf{X}^*(\mu^*) = \sqrt{p_0^T/d_k}\,\nu_{\min}^{d_k}(\mathbf{Q}_k + \mu^*\mathbf{L}_k)$. We want to obtain $\mu^* \ge 0$ that satisfies OC2 and OC3. If 
$c(0) \le \eta_2$, then $\mu = 0$ is the optimal Lagrange multiplier. Otherwise $\mu = 0$ cannot be optimal and $\mu^*$ 
should be strictly greater than 0. In that case we should have $c(\mu^*) = \operatorname{tr}[(\mathbf{X}^*)^H\mathbf{L}_k\mathbf{X}^*] = \eta_2$ from OC2. 
Also, from the constraint (27c) we should have $c(\infty) \le \eta_2$ for a feasible problem. Then, with $c(\mu)$ as a 
continuous function, there exists $\mu^* \in (0, \infty)$ such that $c(\mu^*) = \eta_2$. Thus we use Newton's root 
finding method [24] to search for $\mu^*$. 
The algorithm to solve subproblem (27) is summarized as follows: 

input : $\mu = 0$ 
output: the optimal solution of (27): $\mathbf{X}^*(\mu^*) = \sqrt{p_0^T/d_k}\,\nu_{\min}^{d_k}(\mathbf{Q}_k + \mu^*\mathbf{L}_k)$ 
if $c(\mu) \le \eta_2$ then 
    Set $\mu^* = 0$; 
else 
    Apply Newton's root finding method to solve $c(\mu^*) = \eta_2$; 
end 
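
The procedure above can be sketched in a few lines. The following is a minimal illustration under strong simplifying assumptions, not the authors' implementation: it treats real symmetric 2 x 2 matrices with $d_k = 1$ so that the smallest eigenvector has a closed form, and it swaps Newton's root finding for plain bisection, which suffices because $c(\mu)$ is non-increasing in $\mu$. The function names and the numerical values of `Q`, `L`, `p` and `eta2` are hypothetical.

```python
import math

def min_eigvec_2x2(a, b, c):
    """Unit eigenvector of [[a, b], [b, c]] for its smallest eigenvalue."""
    lam = 0.5 * (a + c) - math.hypot(0.5 * (a - c), b)
    if abs(b) > 1e-14:
        v = (b, lam - a)
    else:  # (numerically) diagonal matrix: pick the axis with the smaller entry
        v = (1.0, 0.0) if a <= c else (0.0, 1.0)
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)

def quad(S, x):
    """x^T S x for a real symmetric 2x2 matrix S."""
    return S[0][0] * x[0] ** 2 + 2.0 * S[0][1] * x[0] * x[1] + S[1][1] * x[1] ** 2

def x_of_mu(mu, Q, L, p):
    """X*(mu) = sqrt(p) * smallest eigenvector of Q + mu * L (case d = 1)."""
    v = min_eigvec_2x2(Q[0][0] + mu * L[0][0],
                       Q[0][1] + mu * L[0][1],
                       Q[1][1] + mu * L[1][1])
    s = math.sqrt(p)
    return (s * v[0], s * v[1])

def solve_precoder(Q, L, p, eta2, iters=200):
    """Return (mu*, X*) satisfying OC1-OC3; bisection replaces Newton here."""
    x = x_of_mu(0.0, Q, L, p)
    if quad(L, x) <= eta2:            # c(0) <= eta2: constraint inactive
        return 0.0, x
    lo, hi = 0.0, 1.0
    while quad(L, x_of_mu(hi, Q, L, p)) > eta2:   # bracket the root
        hi *= 2.0
        if hi > 1e12:
            raise ValueError("c(mu) stays above eta2: subproblem looks infeasible")
    for _ in range(iters):            # shrink [lo, hi] around c(mu) = eta2
        mid = 0.5 * (lo + hi)
        if quad(L, x_of_mu(mid, Q, L, p)) > eta2:
            lo = mid
        else:
            hi = mid
    return hi, x_of_mu(hi, Q, L, p)

Q = [[-2.0, 0.5], [0.5, 1.0]]   # hypothetical indefinite Q_k
L = [[3.0, 1.0], [1.0, 1.0]]    # hypothetical positive definite L_k
mu_star, x_star = solve_precoder(Q, L, p=1.0, eta2=1.5)
```

For larger dimensions or $d_k > 1$, the closed-form eigenvector would be replaced by a numerical Hermitian eigensolver, and a Newton or secant step could replace the bisection loop.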





With the methods for all the three subproblems, the algorithm for (26) is presented as Algorithm 3: 

input : initial value of $\mathbf{U}_k$, $k \in \mathcal{K}$ and $\mathbf{W}_r$, $r \in \mathcal{R}$, $C = 1$ 
output: $\mathbf{U}_k$, $\mathbf{V}_k$, $k \in \mathcal{K}$ and $\mathbf{W}_r$, $r \in \mathcal{R}$ 
repeat 
    Update decoder $\mathbf{V}_k$ by solving its subproblem from Section III-A, $k \in \mathcal{K}$; 
    Update relay beamforming matrix $\mathbf{W}_r$ by solving (10) with inequality constraint, $r \in \mathcal{R}$; 
    Update precoder $\mathbf{U}_k$ by solving (27), $k \in \mathcal{K}$; 
    Update $C$ as $C := P^S / (P^I + P^N)$; 
until Convergence; 

Algorithm 3: Algorithm for multiple stream TSTINR model 



By enforcing the orthogonality constraints on the columns of each precoder, it is guaranteed that 
$\operatorname{rank}(\mathbf{U}_k) = d_k$, $k \in \mathcal{K}$. Thus each user pair has $d_k$ parallel data streams as expected. 
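
The role of the parameter $C$ in the outer loop can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it replaces the three matrix subproblems by an exact search over a finite candidate set, and the functions `signal` and `noise` are hypothetical stand-ins for $P^S$ and $P^I + P^N$. It only shows that minimizing $C(P^I + P^N) - P^S$ and then resetting $C$ to the achieved ratio yields a non-decreasing sequence of ratios.

```python
def dinkelbach(candidates, signal, noise, C=1.0, max_iter=50, tol=1e-12):
    """Maximize signal(x) / noise(x) over a finite set by parametric updates of C."""
    history = []
    x = None
    for _ in range(max_iter):
        # Subproblem: minimize f(x; C) = C * noise(x) - signal(x) with C fixed.
        x = min(candidates, key=lambda z: C * noise(z) - signal(z))
        ratio = signal(x) / noise(x)
        history.append(ratio)
        if abs(ratio - C) <= tol:     # C reproduces itself: stop
            break
        C = ratio                      # update C := P^S / (P^I + P^N)
    return x, history

# Hypothetical one-dimensional stand-ins for P^S and P^I + P^N.
candidates = [i / 10 for i in range(21)]
signal = lambda x: 2.0 * x + 0.5
noise = lambda x: 1.0 + x * x

x_best, history = dinkelbach(candidates, signal, noise)
```

In Algorithm 3 the inner minimization is only carried out approximately by alternating over the three blocks of variables, so the monotonicity argument needs a sufficient decrease of the reformulated objective in each iteration rather than an exact minimizer.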

V. Simulations 

In this section, we evaluate the performance of our proposed algorithms. The simulations consist of two 
parts, in which the single stream and the multiple stream models are analyzed, respectively. In both parts, 
each element of $\mathbf{G}_{rk}$ and $\mathbf{H}_{kr}$, $k \in \mathcal{K}$, $r \in \mathcal{R}$, is generated i.i.d. from a complex Gaussian distribution 
with zero mean and unit variance. The noise variances are set as $\sigma_1^2 = \sigma_2^2 = \sigma^2 = 1$. Initial values of 
$\mathbf{U}_k$, $k \in \mathcal{K}$, and $\mathbf{W}_r$, $r \in \mathcal{R}$, are randomly generated and scaled to be feasible. Initially, the parameter 
in the TSTINR model is set as $C = 1$. For each plotted point, 100 random realizations of different channel 
coefficients are generated to evaluate the average performance. 
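
As a rough sketch of this setup, the snippet below draws the channels and a feasible starting point with the Python standard library only. It is an illustration under assumptions, not the Matlab code used for the reported results: the function names are hypothetical, the matrix sizes follow the convention that $\mathbf{G}_{rk}$ maps user $k$ to relay $r$ and $\mathbf{H}_{kr}$ maps relay $r$ to user $k$, and the power constraints are met with equality by rescaling.

```python
import random

def crandn(rows, cols, rng):
    """rows x cols matrix with i.i.d. complex Gaussian entries, zero mean, unit variance."""
    s = 0.5 ** 0.5   # real and imaginary parts each have variance 1/2
    return [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(cols)]
            for _ in range(rows)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def fro2(A):
    """Squared Frobenius norm."""
    return sum(abs(a) ** 2 for row in A for a in row)

def scale(A, s):
    return [[s * a for a in row] for row in A]

def relay_power(W, G, U, sigma1_sq):
    """sum_r ( sum_k ||W_r G_rk U_k||_F^2 + sigma_1^2 ||W_r||_F^2 )."""
    total = 0.0
    for r, Wr in enumerate(W):
        total += sigma1_sq * fro2(Wr)
        for k, Uk in enumerate(U):
            total += fro2(matmul(matmul(Wr, G[r][k]), Uk))
    return total

def random_feasible_start(K, R, M, N, L, d, p0, sigma1_sq=1.0, seed=0):
    rng = random.Random(seed)
    p_max = R * p0                                    # total relay power budget
    G = [[crandn(L, M, rng) for _ in range(K)] for _ in range(R)]   # G[r][k]
    H = [[crandn(N, L, rng) for _ in range(R)] for _ in range(K)]   # H[k][r]
    U = []
    for _ in range(K):
        Uk = crandn(M, d, rng)
        U.append(scale(Uk, (p0 / fro2(Uk)) ** 0.5))   # ||U_k||_F^2 = p0
    W = [crandn(L, L, rng) for _ in range(R)]
    alpha = (p_max / relay_power(W, G, U, sigma1_sq)) ** 0.5
    W = [scale(Wr, alpha) for Wr in W]                # relay power equals p_max
    return G, H, U, W

G, H, U, W = random_feasible_start(K=4, R=4, M=4, N=2, L=4, d=1, p0=100.0)
```

The relay power is quadratic in the relay matrices, so one common scaling factor is enough to meet the total relay constraint. For the multiple stream model of Section IV the random precoders would additionally need orthogonal columns of equal norm, which this sketch does not enforce.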

Here we define SNR as $\text{SNR} = p_0^T/\sigma^2$ and $p_{\max}^R = R \cdot p_0^T$. We use the system sum rate $R_{\text{sum}}$ as the 
measure of QoS. All simulations are run on a Matlab R2010a simulation server with an Intel Core i7-875K 
CPU and 8 GB RAM under 64-bit Debian Linux. 
A. Single stream models 

In this subsection, we consider a $(4 \times 2, 1)^4 + 2^4$ MIMO relay system, which means $M_k = 4$, $N_k = 2$, 
$L_r = 4$, $d_k = 1$, for all $k \in \mathcal{K}$ and $r \in \mathcal{R}$, and $K = R = 4$. First, we analyze models with per user 
and total relay power constraints. Our algorithm for the TSTINR model is compared with those of the TLIN 
model and the WMMSE model with power control in [9]. Here we use the SeDuMi toolbox to solve 
the subproblems with the SDP algorithm in WMMSE. For the TLIN model, we apply an algorithm similar 
to the one proposed for TSTINR to speed it up. 

Fig. 2 and Fig. 3 depict the achieved sum rate and the computing time with respect to different 
SNR values for the three algorithms, respectively. In terms of the achieved sum rate, TSTINR 
outperforms TLIN for general SNR values as expected, and outperforms WMMSE in medium and 
high SNR scenarios. Furthermore, the computing time of TSTINR and TLIN is much less than that 
of WMMSE, which accords with the computational complexity analysis. By numerical evidence, the 
computing time of WMMSE increases with increasing SNR value, mainly because the rank-one solution 
is less frequently observed for SDP, and the relaxation technique [25] has to be used. 

Then we compare the algorithms for models with individual user and individual relay power constraints. 
Fig. 4 depicts the average achievable sum rate of our proposed algorithms for the TLIN, TSTINR and 
WMMSE models with fixed transmit power, as well as the algorithm in [9] for the WMMSE model with 
power control. It is shown that WMMSE with power control generally enjoys a higher sum rate than 
WMMSE with fixed power constraints. Thus the system sum rate benefits substantially from power control. 
Even so, our proposed TSTINR with fixed power constraints outperforms WMMSE with power control for 
medium to high SNR scenarios. Also, TSTINR outperforms TLIN in general. Here the complexity as 
well as the computing time of TLIN, TSTINR and WMMSE with power control behave similarly to 
the algorithms in Section III-A. 

The comparisons for the total relay and the individual relay constraint cases are similar. In both Fig. 2 
and Fig. 4 the curves representing TSTINR and WMMSE with power control intersect at 
about SNR = 30 dB. For SNR values higher than 30 dB, our proposed TSTINR algorithm outperforms the 
WMMSE algorithm with power control. The performance confirms the analysis: in low SNR scenarios 
the MMSE receiver filter is almost optimal among linear filters, while in high SNR scenarios the receiver 
filter should be close to the zero-forcing solution, which can be achieved by TSTINR. 

Fig. 2 Average achievable sum rate versus SNR, total relay power constraint (curves: TLIN, TSTINR, WMMSE with power control) 

Fig. 3 Average computing time versus SNR (dB), total relay power constraint (curves: TLIN, TSTINR, WMMSE with power control) 

B. Multiple stream model 

In this subsection we consider multiple stream models of the MIMO relay networks. We investigate 
three kinds of $2 \times 2 \times 2$ networks with different numbers of antennas, that is, $K = R = 2$. Here for the 
scheme $d_1 = d_2 = 1$ we choose the maximum system sum rate result between our TSTINR model and 
the WMMSE model with power control in [9]. 

First we consider a network with 2 antennas for each user and 4 antennas for each relay. The number 
of data streams for User $k$, $d_k$, varies from 1 to 2 for both $k = 1, 2$. For different choices of $d_k$, $k = 1, 2$, 
the average sum rate results corresponding to different SNR values are shown in Fig. 5. As expected, 
in the low SNR scenario the single stream scheme with $d_1 = d_2 = 1$ outperforms the other schemes; in 
medium to high SNR scenarios, the scheme $d_1 = d_2 = 2$ becomes dominant and the scheme $d_1 = d_2 = 1$ 
performs worse than all others in terms of sum rate. 

In the second example, the considered network has the same parameters as the previous one, except 
that each relay has 2 antennas. Similar to Fig. 5, the average achieved sum rate results corresponding to 
different SNR values for different requirements of data streams are shown in Fig. 6. The curves are quite 
different from the previous example. Here the schemes $d_1 = 1, d_2 = 2$ and $d_1 = 2, d_2 = 1$ outperform the 
other two schemes in medium to high SNR scenarios, and generally the scheme $d_1 = d_2 = 2$ performs 
very poorly. 

Fig. 5 Average achievable sum rate versus SNR, $M_k = N_k = 2$, $L_r = 4$ 

Fig. 6 Average achievable sum rate versus SNR, $M_k = N_k = L_r = 2$ 

The performances shown in Fig. 5 and Fig. 6 behave similarly to recent theoretical results 
on the DoF of MIMO relay networks. From the cut-set bound theory, the maximum DoF of the first network 
is no greater than 2 for each user [31, Theorem 15.1], and the simulations in Fig. 5 verify the benefit of 
transmitting 2 data streams for each user over the other schemes. However, in the second example there is 
no extra relay antenna to align interference besides transmitting the desired signal. Without symbol 
extension or time division, 2 DoFs for each user are not achievable. This accords with the performances in 
Fig. 6. In [11] the authors show that $\frac{3}{2}$ DoFs for each user are achievable in a $2 \times 2 \times 2$ network with 
2 antennas for each user and each relay. The essential idea of the transmission scheme is to sacrifice one data 
stream for interference and make full use of all other streams. Correspondingly, in Fig. 6 for the medium 
to high SNR scenarios the schemes $d_1 = 1, d_2 = 2$ and $d_1 = 2, d_2 = 1$ perform the best. This indicates 
a substantial benefit in system sum rate from such schemes. This is also verified in the third example with 
3 antennas for each user and each relay. The comparison of different data stream schemes is depicted 
in Fig. 7, where the scheme $d_1 = 2, d_2 = 3$ achieves the highest sum rate among all the schemes in 
medium to high SNR. In general, multiple stream schemes improve the system sum rate in medium to 
high SNR scenarios. 

Fig. 7 Average achievable sum rate versus SNR, $M_k = N_k = L_r = 3$ 



VI. Conclusion 

This paper considered the general $K \times R \times K$ MIMO relay network. With per user and total relay power 
constraints, an algorithm for the TSTINR model was proposed. For the individual user and individual 
relay fixed power constraints, the TSTINR algorithm was extended, and the algorithms for the TLIN and 
WMMSE models in [9] were also modified to solve the problem with such constraints. Computational 
complexity analysis showed that our proposed algorithm for TSTINR has much lower complexity than 
the WMMSE algorithm from [9]. Focusing on per user and total relay power constraints, we proposed a 
multiple stream TSTINR model to overcome the disadvantage of the previous models that only a single 
data stream can be transmitted for each user. Simulations showed that in the single stream case the TSTINR 
algorithm performs better than WMMSE in medium and high SNR scenarios for both constraint cases, 
and generally outperforms TLIN, in terms of achievable sum rate; the system sum rate significantly 
benefits from multiple data stream transmission in medium to high SNR scenarios. 

VII. Appendix 

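A. Proof of Theorem 1 

First we introduce two lemmas for the proof of Theorem 1. 

Lemma 1: [29, Theorem 3.2.2] Suppose 

$$\mathbf{A} = \begin{pmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{pmatrix} \succeq 0, \qquad \mathbf{B} = \begin{pmatrix} \mathbf{B}_{11} & \mathbf{B}_{12} \\ \mathbf{B}_{21} & \mathbf{B}_{22} \end{pmatrix} \succ 0$$

are two Hermitian matrices with the same dimensions. $\mathbf{A}_{11} \succ 0$ and $\mathbf{B}_{11} \succ 0$ also have the same 
dimensions. Then 

$$\frac{\det(\mathbf{A} + \mathbf{B})}{\det(\mathbf{A}_{11} + \mathbf{B}_{11})} \ge \frac{\det(\mathbf{A})}{\det(\mathbf{A}_{11})} + \frac{\det(\mathbf{B})}{\det(\mathbf{B}_{11})} \ge \frac{\det(\mathbf{B})}{\det(\mathbf{B}_{11})}.$$

So this derives: 

$$\frac{\det(\mathbf{A} + \mathbf{B})}{\det(\mathbf{B})} \ge \frac{\det(\mathbf{A}_{11} + \mathbf{B}_{11})}{\det(\mathbf{B}_{11})}.$$

Lemma 2: [29, Theorem 6.8.1] Suppose $\mathbf{C}$, $\mathbf{B}$ are two Hermitian matrices with $\mathbf{C} \succeq \mathbf{B} \succ 0$. Then the 
following inequality holds: 

$$\frac{\det(\mathbf{C})}{\det(\mathbf{B})} \ge \frac{\operatorname{tr}(\mathbf{C})}{\operatorname{tr}(\mathbf{B})}.$$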
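
As a quick numerical sanity check of Lemma 2, and not a proof, the sketch below samples random real symmetric 2 x 2 matrices with $\mathbf{C} = \mathbf{A} + \mathbf{B}$, $\mathbf{A} \succeq 0$ and $\mathbf{B} \succ 0$, which is the structure used later in the proof, and records the gap between the two sides of the inequality. The helper names and the ridge value are hypothetical choices made for the illustration.

```python
import random

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def tr2(A):
    return A[0][0] + A[1][1]

def gram2(rng, ridge):
    """Random symmetric 2x2 matrix M M^T + ridge * I (positive definite if ridge > 0)."""
    m = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(2)]
    return [[sum(m[i][k] * m[j][k] for k in range(2)) + (ridge if i == j else 0.0)
             for j in range(2)] for i in range(2)]

def lemma2_gap(B, C):
    """det(C)/det(B) - tr(C)/tr(B); Lemma 2 says this is non-negative when C >= B > 0."""
    return det2(C) / det2(B) - tr2(C) / tr2(B)

rng = random.Random(1)
gaps = []
for _ in range(1000):
    B = gram2(rng, 0.1)        # B is positive definite
    A = gram2(rng, 0.0)        # A is positive semidefinite
    C = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
    gaps.append(lemma2_gap(B, C))
min_gap = min(gaps)
```

A non-negative `min_gap` (up to rounding) is consistent with the lemma on these samples; it says nothing about larger dimensions or complex Hermitian matrices, for which the cited theorem is needed.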

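Let $\mathbf{A}_k = \mathbf{T}_{kk}\mathbf{T}_{kk}^H$, $\mathbf{B}_k = \sum_{q \in \mathcal{K}, q \neq k} \mathbf{T}_{kq}\mathbf{T}_{kq}^H + \sigma_1^2 \sum_{r \in \mathcal{R}} \mathbf{H}_{kr}\mathbf{W}_r\mathbf{W}_r^H\mathbf{H}_{kr}^H + \sigma_2^2\mathbf{I}$ and 
$\mathbf{C}_k = \mathbf{A}_k + \mathbf{B}_k$, $k \in \mathcal{K}$. Then we have 

$$1 + \text{TSTINR} = 1 + \frac{\sum_{k \in \mathcal{K}} \operatorname{tr}(\mathbf{V}_k^H\mathbf{A}_k\mathbf{V}_k)}{\sum_{k \in \mathcal{K}} \operatorname{tr}(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k)} = \frac{\sum_{k \in \mathcal{K}} \operatorname{tr}(\mathbf{V}_k^H\mathbf{C}_k\mathbf{V}_k)}{\sum_{k \in \mathcal{K}} \operatorname{tr}(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k)}, \qquad R_{\text{sum}} = \sum_{k \in \mathcal{K}} \log_2 \det(\mathbf{B}_k^{-1}\mathbf{C}_k).$$

From the definitions we conclude that $\mathbf{B}_k \succ 0$, $\mathbf{C}_k \succ 0$ and $\mathbf{C}_k \succeq \mathbf{B}_k$, $k \in \mathcal{K}$. 

1. We prove that for any $k \in \mathcal{K}$, 

$$\det(\mathbf{B}_k^{-1}\mathbf{C}_k) \ge \det[(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k)^{-1}(\mathbf{V}_k^H\mathbf{C}_k\mathbf{V}_k)]. \tag{29}$$

For simplicity we omit the subscript $k$. Let the columns of $\mathbf{V}_\perp \in \mathbb{C}^{N \times (N-d)}$ be an orthonormal basis of the 
complementary subspace of the subspace spanned by the columns of $\mathbf{V}$. That is, $\mathbf{Q} = [\mathbf{V}, \mathbf{V}_\perp] \in \mathbb{C}^{N \times N}$ is 
a unitary matrix. Then we deduce that 

$$\det(\mathbf{B}^{-1}\mathbf{C}) = \frac{\det(\mathbf{Q}^H\mathbf{C}\mathbf{Q})}{\det(\mathbf{Q}^H\mathbf{B}\mathbf{Q})} \ge \frac{\det(\mathbf{V}^H\mathbf{C}\mathbf{V})}{\det(\mathbf{V}^H\mathbf{B}\mathbf{V})}.$$

Here the equality is deduced from the property of the unitary matrix $\mathbf{Q}$. The inequality comes from 
Lemma 1, $\mathbf{C} = \mathbf{A} + \mathbf{B}$ and the fact that 

$$\mathbf{Q}^H\mathbf{C}\mathbf{Q} = \begin{pmatrix} \mathbf{V}^H\mathbf{C}\mathbf{V} & \mathbf{V}^H\mathbf{C}\mathbf{V}_\perp \\ \mathbf{V}_\perp^H\mathbf{C}\mathbf{V} & \mathbf{V}_\perp^H\mathbf{C}\mathbf{V}_\perp \end{pmatrix}.$$

2. $\mathbf{C}_k \succeq \mathbf{B}_k$ induces $\mathbf{V}_k^H\mathbf{C}_k\mathbf{V}_k \succeq \mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k$. Lemma 2 shows that 

$$\frac{\det(\mathbf{V}_k^H\mathbf{C}_k\mathbf{V}_k)}{\det(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k)} \ge \frac{\operatorname{tr}(\mathbf{V}_k^H\mathbf{C}_k\mathbf{V}_k)}{\operatorname{tr}(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k)}. \tag{30}$$

From (29) and (30), it is concluded that 

$$R_{\text{sum}} = \sum_{k \in \mathcal{K}} \log_2 \det(\mathbf{B}_k^{-1}\mathbf{C}_k) \ge \sum_{k \in \mathcal{K}} \log_2 \det[(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k)^{-1}(\mathbf{V}_k^H\mathbf{C}_k\mathbf{V}_k)] \ge \sum_{k \in \mathcal{K}} \log_2 \frac{\operatorname{tr}(\mathbf{V}_k^H\mathbf{C}_k\mathbf{V}_k)}{\operatorname{tr}(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k)}. \tag{31}$$

3. Finally we prove that: 

$$\sum_{k \in \mathcal{K}} \log_2 \frac{\operatorname{tr}(\mathbf{V}_k^H\mathbf{C}_k\mathbf{V}_k)}{\operatorname{tr}(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k)} \ge \log_2 \frac{\sum_{k \in \mathcal{K}} \operatorname{tr}(\mathbf{V}_k^H\mathbf{C}_k\mathbf{V}_k)}{\sum_{k \in \mathcal{K}} \operatorname{tr}(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k)}. \tag{32}$$

With any scalars $t_k \ge 1$ and the fact that $\operatorname{tr}(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k) > 0$, $k \in \mathcal{K}$, it is deduced that: 

$$\Big( \prod_{k \in \mathcal{K}} t_k \Big) \Big( \sum_{k \in \mathcal{K}} \operatorname{tr}(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k) \Big) \ge \sum_{k \in \mathcal{K}} t_k \operatorname{tr}(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k),$$

which holds because $\prod_{q \in \mathcal{K}} t_q \ge t_k$ for every $k$ when all $t_q \ge 1$. Let $t_k = \operatorname{tr}(\mathbf{V}_k^H\mathbf{C}_k\mathbf{V}_k) / \operatorname{tr}(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k)$, divide 
both sides by $\sum_{k \in \mathcal{K}} \operatorname{tr}(\mathbf{V}_k^H\mathbf{B}_k\mathbf{V}_k)$, and take the logarithm of both sides; thus 
we have (32). Combining (31) and (32) we prove Theorem 1. 

B. Proof of Theorem 2 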

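Let $\{\mathbf{X}\}$ represent the set of the iterative points $\{\{\mathbf{U}\}, \{\mathbf{V}\}, \{\mathbf{W}\}\}$. Suppose $\{\mathbf{X}^i\}$ are the feasible 
points achieved from the $i$th iteration. Define $P^{I+N} = P^I + P^N$. Then the expression of the parameter $C$ 
used in the $i$th iteration, which is also the TSTINR achieved in the $(i-1)$th iteration, is: 

$$C^i = \text{TSTINR}^{i-1} = \frac{P^S(\{\mathbf{X}^{i-1}\})}{P^{I+N}(\{\mathbf{X}^{i-1}\})}.$$

As in the $i$th iteration there is sufficient reduction of the objective function $f$, it holds that 

$$f(\{\mathbf{X}^i\}; C^i) = C^i P^{I+N}(\{\mathbf{X}^i\}) - P^S(\{\mathbf{X}^i\}) \le f(\{\mathbf{X}^{i-1}\}; C^i) = 0.$$

Then $\text{TSTINR}^i = \frac{P^S(\{\mathbf{X}^i\})}{P^{I+N}(\{\mathbf{X}^i\})} \ge C^i = \text{TSTINR}^{i-1}$. Thus the value of TSTINR increases monotonically. 

Suppose $\{\mathbf{X}^*\} = \{\{\mathbf{U}^*\}, \{\mathbf{V}^*\}, \{\mathbf{W}^*\}\}$ is a stationary point of the reformulated problem and $\lambda \in \mathbb{R}$ is the 
Lagrange multiplier of (6d): $h(\{\mathbf{X}\}) = \sum_{r \in \mathcal{R}} \big( \sum_{k \in \mathcal{K}} \|\mathbf{W}_r\mathbf{G}_{rk}\mathbf{U}_k\|_F^2 + \sigma_1^2\|\mathbf{W}_r\|_F^2 \big) - p_{\max}^R = 0$. Then the first 
order optimality conditions of the reformulated problem with respect to $\mathbf{W}_r$ are: 

$$C \frac{\partial P^{I+N}(\{\mathbf{X}^*\})}{\partial \mathbf{W}_r^*} - \frac{\partial P^S(\{\mathbf{X}^*\})}{\partial \mathbf{W}_r^*} - \lambda \frac{\partial h(\{\mathbf{X}^*\})}{\partial \mathbf{W}_r^*} = 0, \tag{33}$$

$$h(\{\mathbf{X}^*\}) = 0. \tag{34}$$

When the iterative points converge to $\{\mathbf{X}^*\}$, we have $C = \frac{P^S(\{\mathbf{X}^*\})}{P^{I+N}(\{\mathbf{X}^*\})}$. Substituting this into (33), dividing by 
$-P^{I+N}(\{\mathbf{X}^*\})$ and letting $\bar{\lambda} = -\frac{\lambda}{P^{I+N}(\{\mathbf{X}^*\})}$, we have 

$$\frac{\partial\, \text{TSTINR}(\{\mathbf{X}^*\})}{\partial \mathbf{W}_r^*} - \bar{\lambda} \frac{\partial h(\{\mathbf{X}^*\})}{\partial \mathbf{W}_r^*} = 0. \tag{35}$$

With $\bar{\lambda}$ as the Lagrange multiplier of (6d), (35) and (34) constitute the first order optimality conditions 
of (6) with respect to $\mathbf{W}_r$. Similarly, we are able to obtain the first order optimality conditions of (6) 
with respect to the other variables. Thus $\{\mathbf{X}^*\}$ is also a stationary point of (6). 

C. Proof of Theorem 5 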

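Similar to Theorem 6 for problem (27), the optimality conditions for (25) are as follows: 

1. $\mathbf{X}^*(\theta^*)$, $\boldsymbol{\Phi}_k^*(\theta^*)$ are the optimal solutions for the problem below: 

$$
\begin{aligned}
\min \quad & \operatorname{tr}[\mathbf{X}^H(\mathbf{Q}_k + \theta^*\mathbf{L}_k)\mathbf{X}] \\
\text{s.t.} \quad & \mathbf{X}^H\mathbf{X} = \boldsymbol{\Phi}_k, \\
& \operatorname{tr}(\boldsymbol{\Phi}_k) \le p_0^T, \ \boldsymbol{\Phi}_k \text{ is diagonal}, \ \boldsymbol{\Phi}_k \succeq 0.
\end{aligned} \tag{36}
$$

2. Complementary condition holds: $\theta^*\{\operatorname{tr}[(\mathbf{X}^*)^H\mathbf{L}_k\mathbf{X}^*] - \eta_2\} = 0$. 

3. $c(\theta^*)$ satisfies (25c): $c(\theta^*) = \operatorname{tr}\{[\mathbf{X}^*(\theta^*)]^H\mathbf{L}_k\mathbf{X}^*(\theta^*)\} \le \eta_2$. 

Here $\theta^* \ge 0$ is the optimal Lagrange multiplier of (25c). With $\theta^*$ we are able to obtain the optimal 
$\mathbf{X}^*$ and $\boldsymbol{\Phi}_k^*$ of (25). If the $i$th diagonal element of $\boldsymbol{\Phi}_k$ is zero, then the $i$th column of $\mathbf{X}$ should be 0. In 
this situation we can delete this column and optimize the remaining ones. Without loss of generality we 
assume each element of the diagonal of $\boldsymbol{\Phi}_k$ is strictly positive, that is, $\boldsymbol{\Phi}_k \succ 0$. Let $\mathbf{R} = \mathbf{Q}_k + \theta^*\mathbf{L}_k$ 
and $\mathbf{Y} = \mathbf{X}\boldsymbol{\Phi}_k^{-1/2}$. Omit the index $k$ for simplicity. Problem (36) is equivalent to the following problem: 

$$\min_{\mathbf{Y} \in \mathbb{C}^{M \times d},\ \boldsymbol{\Phi} \in \mathbb{C}^{d \times d}} \operatorname{tr}(\mathbf{Y}^H\mathbf{R}\mathbf{Y}\boldsymbol{\Phi}) \tag{37a}$$
$$\text{s.t.} \ \mathbf{Y}^H\mathbf{Y} = \mathbf{I}_d, \tag{37b}$$
$$\operatorname{tr}(\boldsymbol{\Phi}) \le p_0^T, \ \boldsymbol{\Phi} \text{ is diagonal}, \ \boldsymbol{\Phi} \succeq 0. \tag{37c}$$

In the following we analyze the optimal solutions of (37) to show the property of $\mathbf{Y}^*$ and $\boldsymbol{\Phi}^*$. Here we 
treat the columns of $\mathbf{Y}$ as linear combinations of the eigenvectors of $\mathbf{R}$. Let $t_1 \le t_2 \le \ldots \le t_M$ be 
the eigenvalues of $\mathbf{R}$. Suppose $\mathbf{y}_i$, the $i$th column of $\mathbf{Y}$, is a linear combination of the eigenvectors 
corresponding to the eigenvalues $\{t_j, j \in \Omega_i\}$, where $\Omega_i \subseteq \{1, 2, \ldots, M\}$. Then according to (37b), 
different columns correspond to different eigenvalues of $\mathbf{R}$, that is, $\bigcap_{i=1}^{d} \Omega_i = \emptyset$. Then the feasible set is 
divided into $d$ independent parts with $\mathbf{y}_i$ as variables, for $i = 1, 2, \ldots, d$, respectively. Define $\Phi_{ii}$ as the 
$i$th diagonal element of $\boldsymbol{\Phi}$. The objective function of (37) is rewritten as $\sum_{i=1}^{d} \Phi_{ii}\mathbf{y}_i^H\mathbf{R}\mathbf{y}_i$. For any fixed 
feasible $\boldsymbol{\Phi} \succ 0$, the optimal solution of (37), $\mathbf{y}_i^*$, is the eigenvector of matrix $\mathbf{R}$ corresponding to the 
eigenvalue $\bar{t}_i = \min\{t_j, j \in \Omega_i\}$, as long as $\Phi_{ii} > 0$, for any $i = 1, 2, \ldots, d$. Then, taking these solutions 
back into (37), the problem becomes: 

$$
\begin{aligned}
\min_{\bar{t}_i, \Phi_{ii},\ i = 1, \ldots, d} \quad & \sum_{i=1}^{d} \Phi_{ii}\bar{t}_i \\
\text{s.t.} \quad & \Phi_{ii} \ge 0, \ i = 1, \ldots, d, \quad \sum_{i=1}^{d} \Phi_{ii} \le p_0^T, \\
& \bar{t}_i \text{ is an eigenvalue of } \mathbf{R}, \ \bar{t}_i \neq \bar{t}_j \text{ for } i \neq j, \ i, j = 1, \ldots, d.
\end{aligned} \tag{38}
$$

As the objective function of (38) is linear in $\Phi_{ii}$ for any $i = 1, \ldots, d$, the optimal solutions of (38) should 
be $\Phi_{11}^* = p_0^T$, $\Phi_{ii}^* = 0$, $i = 2, \ldots, d$ and $\bar{t}_1 = t_1$. Thus, the optimal solution of (25) is of rank one. 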
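
The last step, a linear objective over the power budget, can be checked by brute force on a small grid. The sketch below is an illustration only: the eigenvalues in `tbar` are hypothetical, and it assumes that the smallest one is negative, as one would expect when $\mathbf{R}$ is indefinite; if all eigenvalues were non-negative, the zero allocation would minimize the objective instead.

```python
from itertools import product

def grid_allocations(d, p, steps):
    """All allocations phi on a uniform grid with phi_i >= 0 and sum(phi) <= p."""
    levels = [p * i / steps for i in range(steps + 1)]
    return [phi for phi in product(levels, repeat=d) if sum(phi) <= p + 1e-12]

def allocate_power(tbar, p, steps=20):
    """Grid minimizer of sum_i phi_i * tbar_i, the objective of (38) for fixed tbar."""
    return min(grid_allocations(len(tbar), p, steps),
               key=lambda phi: sum(f * t for f, t in zip(phi, tbar)))

tbar = [-3.0, -1.0]                 # hypothetical eigenvalues, smallest first
phi_star = allocate_power(tbar, p=1.0)
phi3_star = allocate_power([-3.0, -1.0, -0.5], p=2.0, steps=10)
```

In both cases the whole budget lands on the stream associated with the smallest eigenvalue, which is the rank one behavior stated in Theorem 5.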
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